**Investigation of the Characteristics of NLS-PNA: Influence of NLS Location on Invasion Efficiency**

**Yuichiro Aiba \*,†, Gerardo Urbina †, Masanari Shibata and Osami Shoji \***

Department of Chemistry, Graduate School of Science, Nagoya University, Furo-Cho, Chikusa-Ku, Nagoya, Aichi 464-8602, Japan; gerardoaus@gmail.com (G.U.); shibata.masanari@b.mbox.nagoya-u.ac.jp (M.S.)

**\*** Correspondence: aiba.yuichiro@i.mbox.nagoya-u.ac.jp (Y.A.); shoji.osami@a.mbox.nagoya-u.ac.jp (O.S.)

† These authors equally contributed to this work.

Received: 1 November 2020; Accepted: 30 November 2020; Published: 3 December 2020

**Abstract:** Peptide nucleic acid can recognise sequences in double-stranded DNA (dsDNA) through the formation of a double-duplex invasion complex. This double-duplex invasion is a promising method for the recognition of dsDNA in cellula because peptide nucleic acid (PNA) invasion does not require the prior denaturation of dsDNA. To increase its applicability, we developed PNAs modified with a nuclear localisation signal (NLS) peptide. In this study, the characteristics of NLS-modified PNAs were investigated for the future design of novel peptide-modified PNAs.

**Keywords:** PNA; invasion; DNA; NLS

#### **1. Introduction**

The sequence-specific recognition of double-stranded DNA (dsDNA) has been an important research subject owing to its wide range of potential applications [1–7]. For this purpose, various methods have been developed, including those that make use of DNA-binding proteins [1,3–5,8–12], small molecules such as minor groove binders [13–16], and artificial DNA [17–24]. In addition, the recognition of dsDNA by peptide nucleic acid (PNA), an artificial nucleic acid mimic, has been reported by Nielsen et al. [7,21,24–31]. In 1991, PNA was first designed and synthesised as an analogue of natural nucleic acids, where the negatively charged sugar-phosphate backbone of DNA was substituted with an electrostatically neutral artificial *N*-(2-aminoethyl)glycine backbone (Figure 1a) [25]. Consequently, electrostatic repulsion between PNA and DNA is absent, enabling PNA to form more stable duplexes with complementary DNA via Watson–Crick base pairing than those between complementary DNA strands [31]. Furthermore, when pseudo-complementary PNAs (pcPNAs: see Figure S1)—where conventional adenine (A) and thymine (T) have been replaced by 2,6-diaminopurine (D) and 2-thiouracil (U)—are used, an effective invasion into dsDNA becomes possible, with two strands of pcPNA forming a double-duplex invasion complex (or, simply, invasion complex, Figure 1b) with dsDNA [26,32]. The introduction of D and U into PNA destabilises the formation of PNA/PNA duplexes through steric repulsions between the amino group of D and the thioketone group of U. The thermal stability of the PNA/DNA duplex is enhanced by the formation of stable base pairs between D/U and the natural nucleobases. PNA's unique DNA-recognition mode has been used in many applications, including site-directed mutagenesis [2,7], inhibition of enzymatic activity [26,33,34], and the development of artificial DNA cutters [35], with all of these depending on the sequence specificity of the invasion complex. 
The pseudopeptide backbone of PNA facilitates easy oligomer synthesis and modification for further improvements and imparts enhanced resistance to enzymatic degradation. Thus, double-duplex invasion is a promising method for the targeting of dsDNA in cellula.

*Appl. Sci.* **2020**, *10*, x FOR PEER REVIEW 2 of 9


**Figure 1.** (**a**) Chemical structure of peptide nucleic acid (PNA). (**b**) Stable invasion complex formation using nuclear-localisation-signal-modified PNAs (NLS-PNAs).

However, to realise such applications, an important issue must be overcome, namely the high salt concentrations encountered in the cellular environment. High salt concentrations increase the stability of DNA/DNA duplexes whilst slightly destabilising PNA/DNA duplexes [36], which results in an overall decrease in invasion efficiency [37]. Several approaches have been attempted to enhance invasion efficiency at physiological salt concentrations, such as chemical modification of PNA and conjugation with functional molecules [37–41]. Our group developed conjugates of PNA with Ru-complexes and PNAs modified with a nuclear localisation signal (NLS) peptide to increase their applicability for DNA recognition at high salt concentrations (Figure 1b). These two modified PNAs displayed higher DNA affinity than the corresponding unmodified pcPNAs and effectively formed an invasion complex even at physiological salt concentrations, under which unmodified PNAs stop functioning efficiently. It is important to understand the properties and limitations of these modified PNAs in order to improve them further. Of these two methods, NLS modification is the simpler, as the NLS-modified PNAs (NLS-PNAs) can be obtained by simply conjugating the PNA with the functional peptide, which can be readily introduced during the synthesis of PNAs. Thus, in this study, we focused on NLS-PNAs and investigated their characteristics to obtain insight into how they can be improved.

#### **2. Materials and Methods**

#### *2.1. Materials*

Solvents and reagents were purchased from FUJIFILM Wako Pure Chemical Co. (Tokyo, Japan); Tokyo Chemical Industry Co., Ltd. (Tokyo, Japan); Merck (Darmstadt, Germany); Sigma-Aldrich Co., LLC. (St. Louis, MO, USA); nacalai tesque Inc. (Kyoto, Japan); and Kanto Chemical Co., Inc. (Tokyo, Japan); and were used without further purification. PNA monomers were purchased from ASM Research Chemicals (Hannover, Germany). Oligonucleotides were purchased from Fasmac (Kanagawa, Japan).

PNAs with or without an NLS peptide were synthesised by standard Boc-chemistry-based solid-phase peptide synthesis according to a literature procedure [42], purified by reversed-phase high-performance liquid chromatography (HPLC) (Jasco Co.; Tokyo, Japan), and characterised by matrix-assisted laser desorption/ionization–time of flight mass spectrometry (MALDI-TOF MS) (ultraflex III, Bruker Daltonics; Billerica, MA, USA) (Figures S2 and S3). The concentrations of PNAs and DNAs were determined from their absorbance at 260 nm by using a UV-visible spectrophotometer (molar extinction coefficients for PNA monomers in M<sup>−1</sup> cm<sup>−1</sup>: ε(D) = 7600; ε(C) = 6600; ε(G) = 11,700; ε(U) = 10,200).
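As an illustration of the concentration determination described above, the following minimal sketch sums the monomer extinction coefficients listed in the text and applies the Beer–Lambert law (A = ε·c·l). The simple additivity of monomer values, the 1 cm path length, the example absorbance, and the function names are assumptions made for illustration only; amino acid residues are taken to contribute negligibly at 260 nm.

```python
# Molar extinction coefficients at 260 nm (M^-1 cm^-1), taken from the text.
EPSILON_260 = {"D": 7600, "C": 6600, "G": 11700, "U": 10200}


def extinction_coefficient(pna_sequence: str) -> int:
    """Sum the monomer coefficients (assumes simple additivity)."""
    return sum(EPSILON_260[base] for base in pna_sequence)


def concentration_molar(a260: float, pna_sequence: str, path_cm: float = 1.0) -> float:
    """Beer-Lambert law rearranged for concentration: c = A / (epsilon * l)."""
    return a260 / (extinction_coefficient(pna_sequence) * path_cm)


if __name__ == "__main__":
    pna1 = "GUUDCUGDUG"  # PNA1 sequence as given in Section 3.1
    a260 = 0.0977        # hypothetical absorbance reading, not a measured value
    print(extinction_coefficient(pna1))
    print(f"{concentration_molar(a260, pna1) * 1e6:.2f} uM")
```

Under these assumptions, the same helper can be reused for PNA2 or for any other sequence composed of D, C, G, and U.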

#### *2.2. Invasion Experiments*

Target 130-bp DNA from pBR322 was incubated with a pair of pcPNAs in a 5 mM HEPES buffer (pH 7.0) at 50 °C for 1 h. After incubation, the solutions were subjected to a microchip electrophoresis system (MultiNA, Shimadzu; Kyoto, Japan) and the invasion efficiency was evaluated.

#### **3. Results and Discussion**

#### *3.1. A Positive Effect on Invasion Efficiency by the Introduction of an NLS into Short PNAs*

In a previous report, we successfully demonstrated that conjugation of an NLS to pcPNAs results in enhanced invasion efficiency [37]. Specifically, an NLS peptide with the sequence PKKKRKV found in the SV40 virus, which has been widely studied, was conjugated at the C-terminus of a pentadecamer PNA. If the introduction of an NLS, even in short PNAs, exerts a positive effect upon invasion efficiency, then the properties of NLS-PNAs should be more easily elucidated by examining them with short PNAs, whose synthesis requires less effort than longer PNAs. Thus, we checked the effectiveness of the NLS in decamer PNAs and employed two types of PNA: (1) unmodified PNAs (U-PNAs), i.e., a pair of decamer pcPNAs without an NLS (with only a few lysine residues for solubility) and (2) C-PNAs, which are identical in sequence to U-PNAs but with an NLS at their C-termini.

U-PNAs and C-PNAs were synthesised by standard *tert*-butoxycarbonyl (Boc)-based solid-phase peptide synthesis. The PNA sequences are listed in Table 1. PNA1 and PNA2 represent each strand of pcPNA that is complementary to 10 bp in the pBR322 plasmid (PNA1: GUUDCUGDUG and PNA2: CDUCDGUDDC). All PNAs have a free N-terminal amino group, with the C-terminal carboxylic acid being converted to an amide. The efficiency of invasion complex formation was evaluated by an electrophoresis mobility shift assay (EMSA). The formation of the invasion complex results in a lower electrophoretic mobility, mainly due to changes in the local structure of the dsDNA at the invasion site [43]. This allows the separation of the invasion complex from the free dsDNA. An EMSA of the invasion complex formation in the presence of 75 mM NaCl is shown in Figure 2.
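The relationship between the two pcPNA strands can be checked at the sequence level. The sketch below maps D and U back onto the natural bases they replace (A and T) and confirms that PNA1 and PNA2, as quoted above, are reverse complements of each other, as expected for two strands addressing the same 10-bp site. The helper names are hypothetical, and the check says nothing about strand orientation conventions or binding strength.

```python
# Map pseudo-complementary bases onto the natural bases they replace.
PC_TO_NATURAL = {"D": "A", "U": "T", "G": "G", "C": "C"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def to_natural(pc_sequence: str) -> str:
    """Rewrite a pcPNA sequence using A/T in place of D/U."""
    return "".join(PC_TO_NATURAL[base] for base in pc_sequence)


def reverse_complement(dna: str) -> str:
    """Watson-Crick reverse complement of a DNA-alphabet sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


PNA1 = "GUUDCUGDUG"
PNA2 = "CDUCDGUDDC"

if __name__ == "__main__":
    print(to_natural(PNA1), to_natural(PNA2))
    print(reverse_complement(to_natural(PNA1)) == to_natural(PNA2))
```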


**Table 1.** PNAs synthesised and used in this research.

[a] Bold: PNAs; Italic: amino acids; *K* = lysine; *P* = proline; *R* = arginine; *V* = valine; D = 2,6-diaminopurine; U = 2-thiouracil.

In lane 1, a single band corresponding to free dsDNA was observed between 100 and 150 bp. In the presence of U-PNA, the invasion complex was barely formed under the high-salt conditions employed, giving only a faint band with lower mobility between 180 and 200 bp (lane 2), which was assignable to the invasion complex. However, the C-PNAs with NLS at their C-termini gave an intense band corresponding to the invasion complex (lane 3). It is clear from these results that the invasion efficiency was improved by the conjugation of an NLS to the short PNAs, leading to the formation of a stable invasion complex even at elevated salt concentrations. As a result, it was confirmed that the NLS modifications in short PNAs showed a beneficial effect on invasion. Thus, we decided to investigate the properties of NLS-PNAs by using these short decamer PNAs. The effect of NLS modification appears to be mainly attributable to electrostatic interactions. However, the conjugation of PNA with repeating cationic peptides (e.g., polylysine) also increases non-specific interactions with the DNA backbone, resulting in a negative effect on the formation of the invasion complex [44]. Presumably, in biological motifs, unfavourable sequences (e.g., those provoking non-specific binding to DNA) have been evolutionarily excluded. Thus, we can expect that NLS peptides possess an ideal amino acid sequence that shows a moderate affinity to DNA as well as a positive effect on invasion efficiency.

**Figure 2.** Comparison of U-PNA and C-PNA invasion efficiency in the presence of 75 mM NaCl. M: 20-bp DNA Ladder (TaKaRa); lane 1: 130-bp DNA only; lane 2: U-PNA; lane 3: C-PNA. Invasion conditions: [DNA] = [each PNA] = 50 nM, [HEPES (pH 7.0)] = 5 mM, and [NaCl] = 75 mM at 50 °C for 1 h.

#### *3.2. Changing the Location of the NLS from the C-Terminus to the N-Terminus of the PNAs*

As mentioned above, in the previous report on NLS-PNAs, the NLS peptide was conjugated to the C-termini of the PNAs (C-PNAs). The synthesis of NLS-PNA was carried out from the C-terminus to the N-terminus by solid-phase peptide synthesis, and the NLS peptide was introduced to the resin first, followed by the PNA. When examining various functional peptides as alternatives to NLS for future research, N-terminally modified PNA conjugates, rather than C-terminally modified PNA conjugates, would significantly reduce the time and effort required for investigation.
This is because it is possible to synthesise a large number of peptide-modified PNAs at once by elongating the PNAs first and then splitting the resin followed by peptide conjugation. Therefore, we evaluated whether changing the location of the NLS from the C-terminus to the N-terminus of the PNA (N-PNA) would have a detrimental effect upon invasion efficiency. PNAs modified with NLS peptide at the N-termini (N-PNAs) were synthesised using the same method as C-PNAs, and the invasion experiments with a series of PNAs were carried out under a broad range of salt concentrations, ranging from 0 to 125 mM NaCl.
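The saving offered by the split-resin strategy can be made concrete by counting coupling steps. The sketch below is a rough, illustrative count only: it assumes one coupling per PNA monomer or amino acid, peptides of equal length, and no solubility-enhancing lysines. The function names and the example numbers (five candidate peptides) are hypothetical.

```python
def couplings_c_terminal(pna_len: int, peptide_len: int, n_peptides: int) -> int:
    """C-terminal conjugates: the peptide goes onto the resin first,
    so every conjugate needs its own full PNA elongation."""
    return n_peptides * (peptide_len + pna_len)


def couplings_n_terminal(pna_len: int, peptide_len: int, n_peptides: int) -> int:
    """N-terminal conjugates: the PNA is elongated once, the resin is split,
    and each portion then receives a different peptide."""
    return pna_len + n_peptides * peptide_len


if __name__ == "__main__":
    # Decamer PNA, 7-residue peptide (the length of PKKKRKV), 5 candidate peptides.
    print(couplings_c_terminal(10, 7, 5))
    print(couplings_n_terminal(10, 7, 5))
```

With these illustrative numbers the count drops from 85 to 45 couplings, and the gap widens as more peptides are screened.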

As a control, the invasion efficiency of U-PNAs was also examined at increasing salt concentrations. Figure 3 shows that the band intensity of the invasion complex with U-PNA decreased with increasing salt concentration and almost disappeared when the NaCl concentration exceeded 50 mM. This result reaffirms the aforementioned phenomenon, wherein elevated salt concentrations stabilise dsDNA whilst weakening PNA/DNA interaction, resulting in a lower invasion efficiency. When the same experiment was conducted using C-PNAs, however, a high invasion efficiency beyond 50 mM NaCl was observed. Unlike in our previous report with pentadecamer NLS-PNAs, in the present study, a decrease in invasion efficiency was observed above 75 mM NaCl. This is due to the lower concentration of PNA, which we selected herein to amplify any effect. The introduction of NLS, even in shorter PNAs, was once more confirmed to be beneficial for invasion at high salt concentrations.

Next, the effect of NLS position on invasion efficiency was investigated using N-PNAs. At first glance, both C-PNAs and N-PNAs exhibited similar behaviour in terms of invasion efficiency. The invasion efficiencies of both NLS-PNAs remained high, up to about 75 mM NaCl, although this value gradually decreased with increasing salt concentration. This comparison confirms that, regardless of the location of the NLS, PNA-NLS conjugates display similar performance. However, N-PNAs were slightly better in terms of resistance to higher salt concentrations (over 75 mM) than C-PNAs, with a more pronounced effect at very high salt concentrations of 125 mM.

**Figure 3.** Comparison of the invasion efficiency of (**a**) U-PNAs, (**b**) C-PNAs, and (**c**) N-PNAs. Invasion conditions: [DNA] = [each PNA] = 50 nM, [HEPES (pH 7.0)] = 5 mM, and [NaCl] = 0–125 mM at 50 °C for 1 h.

#### *3.3. Insertion of Proline between the PNAs and Valine of the NLS*

Besides the location of the NLS itself, there is a further difference between C-PNAs and N-PNAs. In order to maintain the directionality of the NLS, in C-PNAs, proline (P) of the NLS connects to the PNA, whereas in N-PNAs, the NLS is connected via valine (V). Unlike valine, proline possesses a unique cyclic structure bestowing it with a much higher backbone rigidity than any of the other canonical amino acids. In order to verify its significance, we synthesised N-Pro-PNAs, which are identical to N-PNAs except that a proline residue has been inserted between the PNA and valine. Interestingly, the insertion of proline proved unfavourable for invasion efficiency, and N-Pro-PNAs displayed invasion properties between those of N-PNAs and C-PNAs (Figure 4). These results indicate that a rigid amino acid connecting the NLS to the PNA may unfavourably affect invasion.

**Figure 4.** (**a**) Invasion experiments for N-Pro-PNAs to evaluate the effect of an amino acid between NLS and PNA. Invasion conditions: [DNA] = [each PNA] = 50 nM, [HEPES (pH 7.0)] = 5 mM, and [NaCl] = 0–125 mM at 50 °C for 1 h. (**b**) Effect of salt concentration on invasion efficiency with various PNAs (U-, C-, N-, and N-Pro-PNAs). The value for invasion efficiency was obtained by comparing the band intensities in a microchip electrophotogram.
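The caption states only that invasion efficiency was obtained by comparing band intensities. One common way to express such a comparison, shown here as a sketch rather than as the authors' exact procedure, is the fraction of total DNA signal found in the shifted band. The function name and the example intensities are hypothetical.

```python
def invasion_efficiency(complex_intensity: float, free_dna_intensity: float) -> float:
    """Percentage of the DNA signal found in the invasion-complex band.

    Assumes both bands are detected with the same response per DNA molecule.
    """
    total = complex_intensity + free_dna_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * complex_intensity / total


if __name__ == "__main__":
    # Hypothetical integrated band intensities (arbitrary units).
    print(invasion_efficiency(complex_intensity=30.0, free_dna_intensity=70.0))
```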

#### **4. Conclusions**

In summary, we investigated the characteristics of NLS-PNA by employing a series of NLS-modified PNAs. These NLS-PNA conjugates were found to be superior to unmodified PNAs at elevated salt concentrations. The difference between C-terminally modified and N-terminally modified NLS-PNAs was not very significant, signifying that N-terminally modified PNA-NLS conjugates can be readily used for research purposes using the same method as previously reported C-terminally modified conjugates. This facilitates the development of new functional peptide-modified PNA conjugates, since solid-phase peptide synthesis requires synthesis from the C-terminus to the N-terminus. As an example, for C-terminally modified PNAs, when multiple functional peptides bound to the same PNA need to be screened, each functional peptide must be synthesised separately. In the case of N-terminal modification, the synthesis of peptide-modified PNAs can be simplified by synthesising the individual PNAs only once, separating the resin beads into as many samples as needed and continuing the synthesis of each NLS. In addition to the finding that the position of the NLS has some influence on invasion, we demonstrated that the amino acids between the NLS and PNA also play an important role. These findings also provide new perspectives for the future design of novel peptide-modified PNAs.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/23/8663/s1, Figure S1: chemical structures of 2,6-diaminopurine (D) and 2-thiouracil (U), Figure S2: HPLC charts of purified PNAs, Figure S3: MALDI-TOF MS spectra of purified PNAs.

**Author Contributions:** Investigation, G.U. and M.S.; writing—original draft preparation, Y.A. and G.U.; writing—review and editing, Y.A., G.U., M.S. and O.S.; supervision, Y.A. and O.S.; project administration, Y.A.; funding acquisition, Y.A. and O.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by JSPS KAKENHI grant no. 19K05730 to Y.A. from the Ministry of Education, Culture, Sports, Science, and Technology (Japan). This work was also supported in part by the Izumi Science and Technology Foundation to Y.A., Kato Memorial Bioscience Foundation to Y.A., and Integrated Research Consortium on Chemical Sciences to Y.A.

**Acknowledgments:** We thank Joshua Kyle Stanfield for checking the English of this manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Construction of an Enzymatically-Conjugated DNA Aptamer–Protein Hybrid Molecule for Use as a BRET-Based Biosensor**

#### **Masayasu Mie \* , Rena Hirashima, Yasumasa Mashimo and Eiry Kobatake**

Department of Life Science and Technology, School of Life Science and Technology, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama 226-8502, Japan; nrmsrh410@gmail.com (R.H.); mashimo.y.ab@m.titech.ac.jp (Y.M.); kobatake.e.aa@m.titech.ac.jp (E.K.)

**\*** Correspondence: mie.m.aa@m.titech.ac.jp

Received: 28 September 2020; Accepted: 27 October 2020; Published: 29 October 2020

**Abstract:** DNA-protein conjugates are useful molecules for construction of biosensors. Herein, we report the development of an enzymatically-conjugated DNA aptamer–protein hybrid molecule for use as a bioluminescence resonance energy transfer (BRET)-based biosensor. DNA aptamers were enzymatically conjugated to a fusion protein via the catalytic domain of porcine circovirus type 2 replication initiation protein (PCV2 Rep) comprising residues 14–109 (tpRep), which was truncated from the full catalytic domain of PCV2 Rep comprising residues 1–116 by removing the flexible regions at the N- and C-termini. For development of a BRET-based biosensor, we constructed a fusion protein in which tpRep was positioned between NanoLuc luciferase and a fluorescent protein and conjugated to single-stranded DNA aptamers that specifically bind to either thrombin or lysozyme. We demonstrated that the BRET ratios depended on the concentration of the target molecules.

**Keywords:** DNA-protein conjugate; replication initiation protein; DNA aptamer; BRET-based biosensor

#### **1. Introduction**

DNA-protein conjugates are useful materials for biosensing applications [1]. Use of DNA in these conjugate molecules allows for both signal amplification as well as molecular recognition (e.g., using DNA aptamers) [2–5]. DNA aptamers bind to specific molecules with high affinity and specificity. Therefore, DNA aptamer-based biosensing systems have been constructed as conjugates of DNA aptamers and reporter molecules [6–8]. Single-stranded DNA (ssDNA) containing aptamer sequences are modified at the 5′ and/or 3′ ends with a fluorophore for use in construction of DNA aptamer-based biosensors [9–12]. In addition to modification of ssDNA with a fluorophore, a reporter protein can also be conjugated to ssDNA [3–5]. However, conventional methods for conjugation of ssDNA to a reporter protein require cumbersome procedures. To overcome these limitations, we developed a method for conjugation of ssDNA to a protein of interest fused with a replication initiation protein (Rep) [5,13]. In this method, a protein fused with Rep can be covalently linked to ssDNA via enzymatic reaction, without the need for any chemical modification of the ssDNA. Recently, the catalytic domain of porcine circovirus type 2 Rep comprising residues 1-116 (pRep) was employed to construct DNA-NanoLuc luciferase (NanoLuc) conjugates [5]. DNA-NanoLuc conjugates were applied for use in a DNA aptamer-sandwich assay system. Moreover, pRep retains its DNA binding activity regardless of whether the protein of interest is fused to the N-terminus or C-terminus.

Bioluminescence resonance energy transfer (BRET)-based biosensors are frequently used in biosensing methods [14–16]. For the design of intramolecular BRET sensors, sensory domains that lead to conformational changes are located between donor and acceptor molecules, such as luciferase and a fluorescent protein. Therefore, we decided to design a fusion protein for application as a BRET biosensor, in which pRep was positioned between luciferase and a fluorescent protein, and then a DNA aptamer was conjugated via pRep. It was anticipated that our designed fusion protein conjugates containing DNA aptamers might undergo conformational changes upon binding of a target molecule to the DNA aptamer (Figure 1). BRET efficiency is dependent on not only the distance between the donor and acceptor molecules, but also their relative orientations. In our design, we speculated that a large change in the distance between the donor and acceptor molecules would not occur even when a target molecule binds to the DNA aptamer. However, the BRET efficiency was expected to change through alteration of their orientation upon binding of a target molecule to the DNA aptamer.

**Figure 1.** BRET-based biosensor with DNA-protein conjugates. The catalytic domain of PCV2 Rep was fused with Venus and NanoLuc. A DNA aptamer with the PCV2 Rep recognition sequence was enzymatically conjugated to the fusion protein via Rep.

Herein, we tested this idea by constructing a fusion protein in which pRep was positioned between Venus as an acceptor fluorophore and NanoLuc as an energy donor. Overlap of the bioluminescent emission spectrum with the excitation spectrum of the acceptor fluorophore is important for selection of a BRET molecule pair. The combination of Venus and NanoLuc has been previously applied in several BRET-based sensors [15,16]. Energy transfer occurs when the donor and acceptor molecules are within 10 nm of each other [14,17]. Based on the known structure of pRep, we expected that BRET between NanoLuc and Venus would occur in the designed fusion protein. As noted above, the BRET efficiency depends not only on the distance between the donor and acceptor molecules, but also on their relative orientations. To overcome limitations associated with molecular orientation, linkers are generally inserted between the sensory domain and the BRET molecules to allow for a certain degree of movement of the BRET molecules [14]. pRep exhibits flexible regions at the N- and C-termini [18]. In the present study, we expected the orientation to change upon binding of target molecules to the sensor. To allow movement of the BRET molecules only when the target molecules bind to the DNA aptamer conjugated to the fusion protein, we constructed a truncated pRep variant from the catalytic domain of PCV2 Rep comprising residues 14–109 (tpRep). By using tpRep, the BRET efficiency is expected to change through alteration of the orientation upon binding of the target molecule to the DNA aptamer. Herein, truncated pReps were expressed in *Escherichia coli* and subsequently purified, and their DNA binding abilities were evaluated.
Finally, tpRep was used to construct DNA-protein conjugates with thrombin- and lysozyme-binding DNA aptamers, and the resultant DNA-protein conjugates were evaluated for use as a BRET-based biosensor.

#### **2. Materials and Methods**

#### *2.1. Construction of Plasmids*

For expression of Rep mutants, the plasmids pET-His-pRep1-109, pET-His-pRep14-116 and pET-His-tpRep were constructed as follows. pReps with the flexible N- and/or C-terminus regions truncated were constructed by site-directed mutagenesis. First, truncated pRep without the flexible 7-amino acid C-terminus, namely pRep1-109, was constructed by site-directed mutagenesis using the primer set 5′-TGGAGCTGTCGACTAAGGTACCCTC-3′ and 5′-TAGTCGACAGCTCCACACTCCATTA-3′. pET-His-pRep, which was previously constructed in our laboratory, was used as a template [5]. The resultant plasmid was named pET-His-pRep1-109. Truncated pRep without the flexible 13-amino acid N-terminus, namely pRep14-116, and tpRep were constructed by the same procedure using the primer set 5′-CGAATTCCACAAACGTTGGGTCTTC-3′ and 5′-ACGTTTGTGGAATTCGCCCATGGCATG-3′. pET-His-pRep and pET-His-pRep1-109 were used as templates, respectively. The resultant plasmids were named pET-His-pRep14-116 and pET-His-tpRep, respectively.

The pET28-Venus∆10-pRep-NanoLuc-His plasmid for expression of the fusion protein Venus∆10-pRep-NanoLuc-His (VpRNH), in which the pRep protein was fused to the N-terminus of NanoLuc and the C-terminus of Venus∆10, was constructed as follows. The Venus∆10 fragment was amplified by PCR using the primer set 5′-tacgaattcagtaaaggagaagaacttttc-3′ and 5′-ggcgaattcggtacccccagcagctgttac-3′, and Nano-lantern(cAMP-1.6)/pRSETB (Addgene #53591) was used as a template. The amplified fragment was cloned into pUC18 and its sequence was confirmed. The resultant plasmid, pUC18-Venus∆10, was digested with *Eco*R I. The obtained fragment was inserted into the pET28-pRep-NanoLuc-His plasmid, which had previously been constructed in our laboratory and had been digested with the same restriction enzyme.

The pET28-Venus∆10-tpRep-NanoLuc-His plasmid for expression of the fusion protein Venus∆10-tpRep-NanoLuc-His (VtpRNH), in which the tpRep protein was fused to the N-terminus of NanoLuc and the C-terminus of Venus∆10, was constructed as follows. To destroy an *Eco*R I restriction site located at the beginning of the Venus∆10 sequence, a single nucleotide of pET28-Venus∆10-pRep-NanoLuc-His was changed by site-directed mutagenesis using the primer set 5′-GGCGATTTATGAGTAAAGGAGAAGAA-3′ and 5′-ACTCATAAATTCGCCCATGGTATATCT-3′. The resultant plasmid, pET28-delEcoRI-Venus∆10-pRep-NanoLuc-His, was digested with *Eco*R I and *Sal* I to remove the fragment encoding pRep. The tpRep fragment, obtained by digesting pET-His-tpRep with the same restriction enzymes, was then inserted. The resultant plasmid was named pET28-Venus∆10-tpRep-NanoLuc-His.

#### *2.2. Protein Expression and Purification*

For protein expression, the respective plasmids were introduced into *E. coli* BL21(DE3)-competent cells. Transformed cells were inoculated in LB medium with 50 µg/mL ampicillin for expression of Rep mutants and with 20 µg/mL kanamycin for expression of the fusion proteins VpRNH and VtpRNH. Cells were then cultured at 37 °C until the OD<sub>660</sub> reached 0.6–0.8, followed by addition of 1 mM isopropyl-β-D(-)-thiogalactopyranoside (IPTG) for induction of protein expression. Cells were cultured overnight at 16 °C for expression of fusion proteins and for 6 h at 25 °C for expression of Rep mutants. Cells were then harvested by centrifugation. The collected cells were suspended in phosphate-buffered saline (PBS: 150 mM NaCl, 16 mM Na<sub>2</sub>HPO<sub>4</sub>, 4 mM NaH<sub>2</sub>PO<sub>4</sub>, pH 7.4) and disrupted by sonication, followed by centrifugation to obtain the soluble fraction. The supernatant was added to Profinity™ IMAC Ni-Charged Resin (Bio-Rad) equilibrated with PBS, followed by rotation at 4 °C for 30 min. After rotation, the samples were washed with Ni-NTA buffer (0.5 M NaCl, 20 mM phosphate buffer, pH 8.0) 5 times, and then washed twice with Ni-NTA buffer containing 10 mM imidazole. Proteins were eluted with Ni-NTA buffer containing 10, 100, and 150 mM imidazole. The eluted samples were dialyzed against PBS 3 times using dialysis tubing. The concentrations of purified proteins were evaluated using a BCA assay kit (Pierce).

#### *2.3. Evaluation of DNA Binding Ability of pRep*

The ssDNA oligonucleotide Rep sub31 (5′-AAGTATTACAAAAACCAGCGCAGTTGGGCAG-3′) was used for evaluation of the DNA binding ability of pRep. The underlined sequence denotes the PCV2 Rep recognition sequence. Purified proteins (5 µM) were mixed with Rep sub31 (5 µM) in 10 µL of reaction buffer (PBS with 2.5 mM MgCl<sub>2</sub>). After incubation for 30 min at 37 °C, the samples were run on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), and the gels were stained with Coomassie Brilliant Blue (CBB).

#### *2.4. Evaluation of Emission Spectra*

BRET of the purified fusion proteins was evaluated using the Nano-Glo Luciferase Assay System (Promega). Fusion proteins (1 µM) in 50 µL of PBS were mixed with the same volume of Nano-Glo Luciferase Assay Reagent. Emission spectra (350–650 nm) were acquired 10 s after reagent addition using an FP6500 spectrofluorophotometer (JASCO, Tokyo, Japan). Emission spectra were normalized to the peak emission of NanoLuc at 450 nm, which was set to an intensity of 1.00.
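The normalization step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' processing script: the function name `normalize_to_donor_peak` and the choice of the sampled wavelength nearest to 450 nm as the reference point are assumptions.

```python
def normalize_to_donor_peak(wavelengths_nm, intensities, donor_nm=450.0):
    """Scale a spectrum so that the intensity at the donor (NanoLuc) peak is 1.00.

    Hypothetical helper: the reference point is the sampled wavelength
    closest to donor_nm (450 nm in the text).
    """
    if len(wavelengths_nm) != len(intensities) or len(wavelengths_nm) == 0:
        raise ValueError("wavelengths and intensities must be non-empty and equal in length")
    # Index of the sampled wavelength nearest to the donor peak.
    idx = min(range(len(wavelengths_nm)),
              key=lambda i: abs(wavelengths_nm[i] - donor_nm))
    reference = intensities[idx]
    if reference <= 0:
        raise ValueError("donor-peak intensity must be positive")
    return [value / reference for value in intensities]
```

After this scaling, the value of the normalized spectrum near 530 nm can be read directly as the acceptor-to-donor emission ratio.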

#### *2.5. Homogeneous Assay with DNA Aptamer*

The single-stranded DNA aptamers for lysozyme, namely Lysoapt 42 (5′-AAGTATTACATCTACGAATTCATCAGGGCTAAAGAGTGCAGAGTTACTTAG-3′), and for thrombin, namely TBA 29 (5′-AAGTATTACAGTCCGTGGTAGGGCAGGTTGGGGTGACT-3′), containing the Rep recognition sequence, were used in the homogeneous assay [19,20]. Human α-thrombin was purchased from Haematologic Technologies. Equal amounts (1 µM each) of VtpRNH and DNA aptamer were mixed in 50 µL of PBS containing 0.5 mM MgCl<sub>2</sub>, followed by incubation at 37 °C. After incubation for 30 min, 10 µL of solutions containing varying concentrations of target molecules was added to the mixture of VtpRNH and DNA aptamer, followed by incubation at room temperature for 1 h. For the thrombin-binding DNA aptamer-VtpRNH conjugate, samples containing bovine serum albumin (BSA) (instead of thrombin) were evaluated as a negative control. Samples were mixed with the same volume of Nano-Glo Luciferase Assay Reagent. Emission spectra (350–650 nm) were acquired 10 s after addition of the reagent using an FP6500 spectrofluorophotometer. Emission spectra were normalized to the peak emission of NanoLuc at 450 nm, which was set to an intensity of 1.00. The change in BRET ratio (∆BRET) was calculated as follows:

$$\Delta\mathrm{BRET} = \frac{\text{emission at 528 nm}}{\text{emission at 450 nm}} - \text{BRET basal ratio}$$

The BRET basal ratio is defined as (emission at 528 nm)/(emission at 450 nm) at 0 µM of the target molecule.
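As a minimal sketch of this definition, the calculation can be written as two small Python functions. The names `bret_ratio` and `delta_bret` are hypothetical and are not taken from the study.

```python
def bret_ratio(emission_528, emission_450):
    """BRET ratio: (emission at 528 nm) / (emission at 450 nm)."""
    if emission_450 <= 0:
        raise ValueError("emission at 450 nm must be positive")
    return emission_528 / emission_450


def delta_bret(emission_528, emission_450, basal_ratio):
    """Change in BRET ratio relative to the basal ratio.

    basal_ratio is the BRET ratio measured at 0 µM of the target molecule.
    """
    return bret_ratio(emission_528, emission_450) - basal_ratio
```

In use, the basal ratio would first be obtained with `bret_ratio` from the target-free sample and then passed to `delta_bret` for each target concentration.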

#### **3. Results and Discussion**

#### *3.1. Truncation of the Catalytic Domain of PCV2 Rep*

We previously developed a method for site-specific conjugation of ssDNA to a protein of interest via a fused replication initiator protein (Rep), such as Gene A\* from bacteriophage phiX174 or the catalytic domain of porcine circovirus type 2 Rep comprising residues 1–116 (pRep) [5,13]. Recently, Gordon's group reported a similar strategy for construction of DNA-protein conjugates using several Reps, including pRep, as HUH-tags [21]. pRep is much smaller than Gene A\* and is well expressed in *E. coli*. The structure of pRep (Protein Data Bank (PDB) ID: 2HW0) exhibits flexible regions at the N- and C-terminals [18]. In our designed fusion protein, Rep was positioned between Venus and NanoLuc. For construction of BRET-based biosensors, flexible linkers are generally inserted between the sensory domain and the BRET molecule to allow a certain degree of movement of the BRET molecule [14]. In the present study, to allow movement of the BRET molecules only when the target molecule binds to the DNA aptamer conjugated to the fusion protein, truncated variants of pRep were required. Truncated variants of pRep were therefore constructed and their DNA binding abilities were evaluated.

Truncated variants of pRep, namely pRep1-109 lacking the last 7 amino acids in the sequence, pRep14-116 lacking the first 13 amino acids in the sequence, and tpRep (pRep14-109) lacking the flexible regions at both the N- and C-terminals of pRep, were expressed in *E. coli* and subsequently purified. Like pRep, the truncated pReps were purified from the soluble fraction. Proteins were mixed with ssDNA containing a Rep recognition sequence for evaluation of the DNA binding abilities of the truncated pReps. After incubation, samples were analyzed by SDS-PAGE (Figure 2). In addition to the bands appearing at the expected sizes of the proteins, all samples also showed bands at higher molecular weights than pReps without conjugation. These higher-molecular-weight bands were attributed to pReps conjugated with ssDNA. Upon cleavage of the specific ssDNA sequence, PCV2 Rep becomes covalently conjugated to the DNA. In addition to this reaction, PCV2 Rep also catalyzes the reverse reaction [18]: it can join two ssDNA fragments, one bearing a free 3′-OH and the other carrying the 5′-phosphate covalently linked to Rep, to regenerate the PCV2 Rep recognition sequence. Therefore, some unreacted protein may remain. However, this joining activity is less efficient than the cleavage activity that is followed by conjugation of DNA to Rep. In this experiment, the ratio of protein to ssDNA was 1:1; increasing the ssDNA concentration should increase the conjugation efficiency. These results demonstrate that tpRep retains the ability to covalently bind to ssDNA even after truncation of the flexible regions located at the N- and C-terminals of pRep. Therefore, tpRep was used for construction of a BRET-based biosensor with a DNA-protein conjugate.

**Figure 2.** DNA binding abilities of truncated pRep proteins. The flexible region of the catalytic domain of PCV2 Rep comprising amino acids 1-116 was truncated from the full protein (pRep1-116, abbreviated pRep; MW 15,100); pRep1-109: pRep lacking the last 7 amino acids in the sequence (MW 14,300); pRep14-116: pRep lacking the first 13 amino acids in the sequence (MW 13,700); and pRep14-109, abbreviated tpRep: pRep lacking the flexible regions at both the N- and C-terminals of pRep (MW 12,900); white triangles indicate pRep without DNA and black triangles indicate pRep conjugated with DNA.

#### *3.2. Construction of DNA-Protein Conjugates with NanoLuc and Venus for BRET-Based Biosensor*

As shown in Figure 1, we designed a BRET-based sensor with DNA-protein conjugates. For construction of the DNA-protein conjugates, the catalytic domain of PCV2 Rep comprising residues 14–109 (tpRep) was fused to the N-terminus of NanoLuc and the C-terminus of Venus∆10 with a His-tag (Figure 3A). The resultant fusion protein, VtpRNH, was expressed in *E. coli* and subsequently purified from the soluble fraction via the His-tag located at the C-terminus. After purification, the activities of each domain of VtpRNH were evaluated. First, the DNA binding ability of tpRep in the fusion protein was evaluated (Figure 3B). Even after fusion with NanoLuc and Venus∆10, tpRep retained its DNA binding ability. The conjugation efficiency was found to increase with increasing DNA concentration. The maximum efficiency was around 50% (Figure 3C). We also evaluated the functions of both Venus and NanoLuc. Venus was shown to fluoresce in the fusion protein (Figure 3D). The emission peak was ~530 nm at an excitation wavelength of 500 nm [22].

Bioluminescence of NanoLuc in the fusion protein was also evaluated. The emission spectrum of the fusion protein exhibited a peak at ~450 nm without external excitation, which corresponds to the emission of NanoLuc [23]. These results suggest that VtpRNH retained its NanoLuc activity. Moreover, in addition to the emission peak around 450 nm, there was a small emission peak at ~530 nm. To identify the origin of this peak, the emission spectra were normalized to the peak emission of NanoLuc at 450 nm, which was set to an intensity of 1.00. As shown in Figure 3E, the emission peak of normalized intensity was observed at ~530 nm. The emission peak of Venus occurs at ~530 nm. These results suggest that BRET occurred between NanoLuc and Venus. We measured the emission spectrum after ssDNA binding via Rep to the fusion protein. Even after ssDNA binding to the fusion proteins via Rep, the emission peak at ~530 nm was observed (data not shown).

**Figure 3.** Evaluation of DNA binding ability of the fusion protein. Design of the constructed fusion protein, VtpRNH (MW 57,900) (**A**). DNA binding of tpRep in the fusion protein; the concentration of VtpRNH was 5 µM; white triangle indicates pRep without DNA conjugation and black triangle indicates pRep with DNA conjugation (**B**). DNA-conjugation efficiency of protein was evaluated by ImageJ (**C**). Emission spectrum of Venus at an excitation wavelength of 500 nm; the concentration of VtpRNH was 3 µM (**D**). NanoLuc activities of fusion proteins. The emission spectrum was normalized to the peak emission of NanoLuc at 450 nm; final concentration of VtpRNH was 0.5 µM (**E**).

#### *3.3. Construction and Evaluation of BRET-Based Biosensor with DNA-Protein Conjugates*

DNA aptamers are well-known molecular binders that are specific to their target molecules. In the present study, thrombin- and lysozyme-binding DNA aptamers were applied for the construction of a BRET-based biosensor with DNA-protein conjugates. For conjugation of the DNA aptamer to the fusion protein via Rep, ssDNA comprising the DNA aptamer and a Rep recognition sequence was reacted with the fusion protein.

The lysozyme-binding DNA aptamer was first conjugated to VtpRNH. After conjugation, the emission spectra of the lysozyme-binding DNA aptamer–protein conjugates with or without lysozyme were evaluated. As shown in the insets of Figure 4A,B, the emission spectra were normalized to the peak emission of NanoLuc at 450 nm, which was set to an intensity of 1.00. The normalized emission spectra of the DNA-protein conjugate with and without BSA (100 µM) were similar (Figure 4A). On the other hand, in the presence of lysozyme, the normalized emission peak at 528 nm increased compared to the control without lysozyme (Figure 4B). These results suggest that the BRET efficiency changes upon binding of lysozyme to the DNA aptamer. To confirm this speculation, we evaluated the emission spectra in the presence of different concentrations of lysozyme. As shown in Figure 4C, the change in BRET ratio (∆BRET) increased with increasing lysozyme concentration. These results confirm that the lysozyme-binding DNA aptamer-VtpRNH conjugate can be used as a BRET-based sensor. The *K*<sub>d</sub> of the lysozyme-binding DNA aptamer is reported to be 30 nM [19]. Compared to this value, the sensitivity of our BRET-based biosensor with the lysozyme-binding DNA aptamer-VtpRNH conjugate is very low. As shown in Figure 3B, unreacted DNA was present. Unreacted DNA should reduce the sensitivity of the sensor because it competes for binding to the target molecule. However, in this case, it is not the main reason for the low sensitivity. We suppose that lysozyme formed aggregates in the presence of unreacted DNA owing to their net charges. Although monomeric lysozyme could not induce a conformational change of the fusion protein, the BRET ratio would be changed by aggregated lysozyme. The lysozyme-binding DNA aptamer-VpRNH conjugate showed similar results (Figure S1). We expected ∆BRET to be larger for the lysozyme-binding DNA aptamer-VtpRNH conjugate than for the DNA-VpRNH conjugate because tpRep was truncated to remove the flexible amino acid sequences from the N- and C-terminals. Contrary to this expectation, the change in BRET ratio for the DNA-VtpRNH conjugate was smaller than that for the DNA-VpRNH conjugate. For DNA-VpRNH at high lysozyme concentrations, we speculate that the BRET ratio was changed not only by changes in orientation but also by changes in distance.

**Figure 4.** Evaluation of lysozyme-binding DNA aptamer–protein conjugates with and without target molecules. Normalized emission spectra of lysozyme-binding DNA aptamer-VtpRNH conjugate in the presence of BSA (**A**) and lysozyme (**B**). Change in BRET ratio (∆BRET) as a function of lysozyme concentration (**C**).

We also evaluated the thrombin-binding DNA aptamer conjugated to VtpRNH for use as a BRET-based biosensor. As shown in the insets of Figure 5A,B, the normalized emission spectra of the thrombin-binding DNA aptamer- and lysozyme-binding DNA aptamer-VtpRNH conjugates were evaluated in the presence of different thrombin concentrations. As shown in Figure 5A, the emission spectra of the lysozyme-binding DNA aptamer-VtpRNH conjugate were unchanged in the presence of different thrombin concentrations. On the other hand, the normalized emission of the thrombin-binding DNA aptamer-VtpRNH conjugate increased with increasing thrombin concentration (Figure 5B). These results were also confirmed by calculation of ∆BRET for each of the conjugates (Figure 5C). These results suggest that the BRET ratio changed only with the appropriate combination of DNA aptamer and target molecule.

**Figure 5.** Evaluation of DNA aptamer-VtpRNH conjugate in the presence of different concentrations of thrombin. Conjugation with lysozyme-binding DNA aptamer (**A**) or thrombin-binding DNA aptamer (**B**). Change in BRET ratio (∆BRET) as a function of thrombin concentration (**C**). Conjugation with lysozyme-binding DNA aptamer (black diamonds) or thrombin-binding DNA aptamer (orange circles).

Finally, to demonstrate the specificity of the conjugates, we evaluated the BRET ratio of the thrombin-binding DNA aptamer-VtpRNH conjugate over a wider range of thrombin or BSA concentrations. As shown in Figure 6, ∆BRET of the thrombin-binding DNA aptamer-VtpRNH conjugate increased with increasing thrombin concentration. On the other hand, ∆BRET did not increase in the presence of different BSA concentrations. This suggests that ∆BRET depends on the specific binding between DNA aptamers and target molecules. These results further suggest that the thrombin-binding DNA aptamer-VtpRNH conjugate can be used as a BRET-based thrombin biosensor. The *K*<sub>d</sub> of the thrombin-binding DNA aptamer is reported to be ~0.5 nM, which is different from the value obtained from our BRET-based biosensor with the thrombin-binding DNA aptamer-VtpRNH conjugate [20]. One reason for this is the existence of unconjugated DNA. In this experiment, equal amounts of DNA aptamer and VtpRNH were mixed. Judging from the result in Figure 3C, around half of the DNA aptamer would remain as unreacted DNA, which should decrease the sensitivity of this sensor. To evaluate the properties of the BRET sensors, the DNA aptamer-VtpRNH conjugates need to be separated from unreacted DNA. In addition, steric effects would reduce the sensitivity of the sensor. Steric effects are required to induce conformational changes; however, they may also reduce the binding ability of the DNA aptamer. Therefore, the *K*<sub>d</sub> of the DNA aptamers should increase in our BRET-based biosensor. The square of the correlation coefficient (*R*<sup>2</sup>) of the linear fit between concentrations of 200 and 1500 nM is close to 1 (*R*<sup>2</sup> = 0.97). This value was much closer to 1 than that for the thrombin-binding DNA aptamer-VpRNH conjugate (*R*<sup>2</sup> = 0.78) (Figure S2). This difference may be caused by the flexible regions of the pRep protein, which could allow movement of the BRET molecules even after binding of the target molecule to the DNA aptamer. These results suggest that DNA aptamer-VtpRNH conjugates have potential for use as BRET-based biosensors. Although the sensitivity of these biosensors was low compared with other DNA aptamer sensors, it might be improved by changing the design of the fusion protein and DNA aptamers.
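The linear fit and *R*<sup>2</sup> value mentioned above can be reproduced with an ordinary least-squares calculation. The sketch below assumes a simple unweighted fit of ∆BRET against thrombin concentration; the function name `linear_fit_r2` is hypothetical, and no measured values from the study are included.

```python
def linear_fit_r2(x, y):
    """Unweighted least-squares line y = slope * x + intercept, plus R^2.

    x: concentrations (e.g., nM); y: corresponding ∆BRET values.
    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("y values must not all be identical")
    return slope, intercept, 1.0 - ss_res / ss_tot
```

An *R*<sup>2</sup> close to 1 indicates that ∆BRET varies almost linearly with concentration over the fitted range, which is the sense in which the two conjugates are compared in the text.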


**Figure 6.** Change in BRET ratio (∆BRET) of thrombin-binding DNA aptamer-VtpRNH conjugate as a function of thrombin concentration in the presence of thrombin (orange circles) and BSA (black triangles) (**A**). Correlation of ∆BRET and thrombin concentration between 200 and 1500 nM (**B**); each value represents the mean of 4 replicates (*n* = 4).

#### **4. Conclusions**

We constructed DNA aptamer–protein hybrid molecules for use as BRET-based biosensors. Fusion proteins were enzymatically conjugated to DNA aptamers via the catalytic domain of porcine circovirus type 2 replication initiation protein, which was positioned between NanoLuc luciferase and Venus. The catalytic domain of PCV2 Rep was shown to retain its DNA binding activity, even after elimination of the flexible regions at the N- and C-terminals. The resultant DNA aptamer–protein hybrid molecule exhibited a weak BRET signal, even in the absence of the target molecule of the DNA aptamer. However, the BRET signal was found to depend on the concentration of the target molecule. These results demonstrate the potential for use of the designed fusion protein as a DNA aptamer-based platform for construction of BRET-based biosensors.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/21/7646/s1, Figure S1. Evaluation of lysozyme-binding DNA aptamer-VpRNH conjugates with and without target molecules. Normalized emission spectra of lysozyme-binding DNA aptamer-VpRNH conjugate in the presence of BSA (A) and lysozyme (B). Change in BRET ratio (∆BRET) as a function of lysozyme concentration (C). Figure S2. ∆BRET of thrombin-binding DNA aptamer-VpRNH conjugate as a function of thrombin concentration (A). Correlation of ∆BRET and thrombin concentration between 200 and 1500 nM (B); each value represents the mean of 4 replicates (*n* = 4).

**Author Contributions:** Conceptualization, M.M.; methodology, M.M., R.H. and Y.M.; investigation, R.H.; writing—original draft preparation, M.M.; writing—review and editing, Y.M. and E.K.; supervision, M.M. and E.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported in part by JSPS KAKENHI Grant Numbers 16K01388 (M.M.) and 26289310 (E.K.).

**Conflicts of Interest:** The authors declare no conflict of interest.



**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **RNA-Peptide Conjugation through an Efficient Covalent Bond Formation**

#### **Shun Nakano, Taiki Seko, Zhengxiao Zhang and Takashi Morii \***

Institute of Advanced Energy, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan; snaka@iae.kyoto-u.ac.jp (S.N.); seko.taiki.56a@st.kyoto-u.ac.jp (T.S.); zhang.zhengxiao.38m@st.kyoto-u.ac.jp (Z.Z.)

**\*** Correspondence: t-morii@iae.kyoto-u.ac.jp

Received: 12 October 2020; Accepted: 10 December 2020; Published: 14 December 2020

**Abstract:** Many methods for modifying an oligonucleotide with a peptide have been developed for therapeutic and diagnostic applications and for the assembly of nanostructures. We have developed a method for the construction of receptor-based fluorescent sensors and catalysts using the ribonucleopeptide (RNP) as a scaffold. Formation of a covalent linkage between the RNA and the peptide subunit of RNP improved its stability, thereby expanding the application of functional RNPs. A representative method was applied for the formation of a Schiff base or dihydroxy-morpholino linkage between a dialdehyde group at the 3′-end of sugar-oxidized RNA and a hydrazide group introduced at the C-terminal of a peptide subunit through a flexible peptide linker. In this report, we investigated the effects of the solution pH and the contributions of the RNA and peptide subunits to the conjugation reaction by using RNA and peptide mutants. The reaction yield reached 90% within 3 h over a wide range of solution pH. The efficient reaction was mainly supported by the electrostatic interaction between the RNA subunit and the cationic peptide subunit of the RNP scaffold. Formation of the RNP complex was verified to efficiently promote the reaction for construction of the RNA-peptide conjugate.

**Keywords:** ribonucleopeptide (RNP); RNA-peptide conjugate; Schiff base; aptamer; fluorescent sensors

#### **1. Introduction**

RNA-peptide conjugates have been constructed for a variety of therapeutic applications, such as the delivery of siRNA [1–3] and the screening of peptide libraries by mRNA display [4,5]. An efficient reaction for forming a covalent linkage between RNA and peptide is important for developing practical applications. Many methods have been reported to conjugate an oligonucleotide (RNA or DNA) with an oligopeptide by applying various chemistries [6–9]. Post-synthetic coupling of the respective oligonucleotide and peptide fragments is the common and reliable method to construct oligonucleotide-peptide conjugates because stepwise solid-phase synthesis of the conjugates encounters difficulties in finding compatible protecting groups for both nucleobases and amino-acid side chains. Several chemical reactions have been applied for coupling the nucleotide and peptide, such as amidation [10,11], disulfide bond formation [12–14], modification of a thiol group with a haloacetyl [15,16] or maleimide group [12,17,18], native chemical ligation [19–21], and click reaction [22,23]. These reactions are certainly useful for the selective coupling of oligonucleotides and oligopeptides. However, they are often associated with drawbacks, such as instability of the linkage, the need to introduce additional chemical groups into both the oligonucleotide and the oligopeptide chains by multistep reactions, limitations on the oligopeptide sequence due to the solubility or reactivity of the side chains, and/or the potential formation of side products and stereoisomers. In some cases, the yields of the product were rather low even with long reaction times. One alternative method for modifying a nucleotide with functional molecules is based on incorporating an aldehyde group into the nucleotide. Oxidation of the sugar moiety by perchloric acid or periodate is one of the common methods for introducing the aldehyde group on DNA or RNA [24–27]. By coupling the aldehyde group with alkoxyamine, cysteine, and hydrazine, formation of the covalent linkage through oxime, thiazolidine, and hydrazone (or a morpholine-like structure), respectively, has been reported [27–29]. These reactions proceed under mild conditions in a relatively short time to provide high coupling yields in aqueous solution. Although this strategy shares some of the limitations mentioned above, it has the advantages of requiring no unnatural nucleotides in the oligonucleotide and of allowing facile preparation of the reactive peptide.

We have developed a stepwise method for the construction of receptor-based fluorescent sensors by using ribonucleopeptide (RNP) as a scaffold (Figure 1A). In this method, a complex of Rev peptide and Rev Responsive Element (RRE) RNA [30] was utilized to construct the RNP library with a randomized RNA sequence in the RNA subunit. RNP receptors [31] for a target molecule were selected from the RNP library by applying the in vitro selection method [32,33]. The peptide subunit of the selected RNP receptors was further modified with a fluorophore to construct a fluorophore-modified RNP receptor (F-RNP) library [34–42]. By screening the F-RNP library, fluorescent RNP sensors showing measurable fluorescence intensity changes upon binding the substrate were selected. Many kinds of RNP receptors and sensors were constructed for various target molecules, such as ATP [34–38,41,42], GTP [34,35,41], dopamine [39], and a tetrapeptide containing a phosphorylated tyrosine residue [40]. Furthermore, the RNA subunit and the peptide subunit of fluorescent RNP sensors were covalently linked to improve the chemical and thermal stability [41]. The covalently linked RNP (c-RNP) sensors were applicable for simultaneous detection of multiple target molecules in solution. Time-course monitoring of the concentration changes of the substrate and product in an enzymatic reaction was demonstrated by simultaneous application of sensors for the substrate and the product [41,42]. As mentioned above, for the covalent bond formation with the aldehyde group, the ribose moiety at the 3′-end of RNA was oxidized by sodium periodate to a dialdehyde group (Figure 1B). A hydrazide group was introduced at the C-terminal of the peptide subunit through a ten-amino-acid flexible linker. Coupling of the RNA dialdehyde group and the C-terminal hydrazide group of the peptide formed a covalent linkage between the RNA and peptide subunits of the Rev-RRE complex. This reaction proceeded rapidly in a quantitative yield. It is likely that a proximity effect between the reactive groups on the RNA and peptide subunits, originating from formation of the RNP complex, assisted the efficient coupling reaction. Here, we investigated several conditions for covalent linkage formation between the dialdehyde group of RNA and the hydrazide group of peptide within the ribonucleopeptide scaffold to elucidate the effect of solution pH on the reaction yield. Furthermore, the role of the proximity effect in the coupling reaction was investigated by using mutants of RNA and peptide. Our results demonstrated that the efficient coupling reaction within the RNP was indeed the outcome of the proximity effect of the reactive groups and provided an optimal condition for the construction of the covalently linked RNA-peptide complex.

**Figure 1.** Schematic illustration for the construction of the covalently linked fluorescent ribonucleopeptide (RNP) sensor. RNA receptors selected from the RNA-derived RNP library were converted into a fluorescent RNP sensor library through modification of the peptide subunit by a fluorophore. (**A**) A fluorescent RNP sensor, selected from the fluorescent RNP sensor library, was converted to a covalently linked RNP (c-RNP) sensor by the crosslinking reaction between RNA and the peptide subunit [41]. (**B**) Scheme of covalent linkage formation between RNA and the peptide subunit of RNP.

#### **2. Materials and Methods**

PrimeSTAR HS DNA polymerase for PCR reactions was obtained from TaKaRa Bio Inc. (Shiga, Japan), and the T7-Scribe Standard RNA IVT Kit was obtained from CELLSCRIPT (Madison, WI, USA). *N*-α-Fmoc-protected amino acids, 2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU), 1-hydroxybenzotriazole (HOBt), distilled *N*,*N*-dimethylformamide (DMF), and 4-hydroxymethylbenzoic acid-polyethylene glycol (PEG) resin (HMBA-PEG resin) were obtained from Watanabe Chemical Industries (Hiroshima, Japan). *N*,*N*-diisopropylethylamine (DIEA), diisopropylcarbodiimide (DIC), and 2′,4′,6′-trihydroxyacetophenone monohydrate (THAP) were obtained from Sigma-Aldrich (St. Louis, MO, USA). *N*,*N*-dimethyl-4-aminopyridine (DMAP), sodium periodate, hydrazine monohydrate, gel electrophoresis grade acrylamide, bisacrylamide, phenol, thioanisole, and 1,2-ethanedithiol were purchased from Wako Chemicals (Tokyo, Japan). Diammonium hydrogen citrate (DAHC) was obtained from Nacalai Tesque (Kyoto, Japan). A reversed-phase C18 column ULTRON VX-ODS (for analysis: 4.6 × 150 mm; for purification: 20 × 250 mm, particle size 5 µm) was purchased from Shinwa Chemical Industries (Kyoto, Japan).

#### *2.1. Preparation of RNA*

Forward and reverse primers for the construction of the double-stranded template DNA for An16 RNA (5′-TCTAATACGACTCACTATAGGTCTGGGCGCA-3′ and 5′-GGCCTGTACCGTC-3′) and for scrAn16 RNA (5′-TCTAATACGACTCACTATAGGAGGCTTCAGCTTCG-3′ and 5′-CACACAACCGCCCCG-3′) were purchased from Sigma–Aldrich, Japan (T7 RNA promoter is underlined). The double-stranded DNA templates and RNAs were prepared as previously described [42]. RNA subunits of RNP receptors were purified by means of denaturing polyacrylamide gel electrophoresis (8 M urea, 12%). The concentrations of purified RNA were quantified by measuring the absorption at 260 nm using molar extinction coefficients of 439,500 (An16) and 438,800 (scrAn16) M<sup>−1</sup> cm<sup>−1</sup>.
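The quantification at 260 nm follows the Beer-Lambert relation, c = A / (ε · l). The following is a minimal sketch of that conversion, not the authors' procedure: the function name, the 1 cm path length, the dilution factor, and the example absorbance reading are all assumptions made for illustration. Only the two extinction coefficients come from the text.

```python
# Molar extinction coefficients at 260 nm quoted in the text (M^-1 cm^-1).
EXTINCTION_260 = {"An16": 439_500, "scrAn16": 438_800}


def rna_concentration_uM(a260, rna, path_cm=1.0, dilution=1.0):
    """Return the RNA stock concentration in µM from an A260 reading.

    Beer-Lambert: c = A / (epsilon * l). `dilution` is the fold-dilution
    applied before the measurement (assumed, not taken from the paper).
    """
    molar = a260 * dilution / (EXTINCTION_260[rna] * path_cm)
    return molar * 1e6  # mol/L -> µmol/L


# Hypothetical reading: A260 = 0.879 for a 100-fold diluted An16 stock.
stock_uM = rna_concentration_uM(0.879, "An16", dilution=100.0)
print(f"An16 stock: {stock_uM:.1f} µM")
```

With these assumed inputs the stock comes out close to the 200 µM RNA solution used for the oxidation step in Section 2.2.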

#### *2.2. Construction of a Covalently Linked RNP*

The peptide subunit for the formation of a covalent linkage (Ac-TRQARRNRRRRWRERQR-GGSGGSGGSG-HZ) was synthesized as follows. HMBA-PEG resin was placed in a dry flask, sufficient DMF was added to soak the resin, and the resin was allowed to swell for 30 min. *N*-α-Fmoc-glycine (10 equivalents relative to resin loading) dissolved in DMF was mixed with a solution of DIC (5 equivalents relative to resin loading) in dry DMF on ice and then incubated for 20 min. The solution of the activated first amino acid was added to the resin prepared above. A DMF solution of DMAP (0.1 equivalents relative to resin loading) was added to the resin/amino acid mixture, which was incubated at room temperature for 1 h with occasional swirling. This procedure of coupling the first amino acid residue was repeated twice. The subsequent synthesis was performed on an automated peptide synthesizer (PSSM-8; Shimadzu, Kyoto, Japan) according to the Fmoc chemistry protocol using protected Fmoc-amino acids and HBTU. Acetylation of the *N*-terminus of the peptide was performed by treatment with 1 M acetic anhydride and 1 M 1-methylimidazole in DMF for 1 h at room temperature. The subsequent procedures (cleavage of the protected peptide from the resin with hydrazine monohydrate, deprotection of the protected peptide, and purification of the peptide) were performed as described previously [41]. The synthesized peptides were characterized by MALDI-TOF mass spectrometry (AXIMA-LNR, Shimadzu) as follows: acetylated Rev peptide hydrazide (Ac-Rev-(GGS)3G-HZ), m/z 3154.9 (calcd for [M+H]<sup>+</sup> 3155.5); acetylated flexible peptide linker hydrazide (Ac-(GGS)3G-HZ), m/z [M+H]<sup>+</sup> 735.3, [M+Na]<sup>+</sup> 757.8 (calcd for [M+H]<sup>+</sup> 734.3).

The crosslinking reaction between the RNA and peptide subunits of the RNP was carried out as described previously [41,42] with a slight modification of the conditions. Freshly prepared 0.01 M sodium periodate (5 µL; 50 nmol; 50 equiv) was added to 200 µM RNA (5 µL; 1 nmol) in 15 µL of 0.03 M sodium acetate (pH 5.2), and the reaction mixture (25 µL) was incubated for 1 h at 37 °C in the dark. After the reaction, 2.5 µL of 10 M glycerol was added to the reaction mixture to quench the excess sodium periodate. The resulting oxidized RNA was purified by ethanol precipitation. A coupling reaction between the 3′-modified RNA (40 µM) and Ac-Rev-(GGS)3G-HZ (88 µM) was performed in 0.03 M sodium acetate (for pH 4 and 5) or 0.03 M sodium phosphate (for pH 6 and 7) containing 0.01 M NaCl (total 25 µL) at 37 °C in the dark. The reaction mixture was extracted with phenol/chloroform and purified by ethanol precipitation to remove unreacted peptide, then dissolved in TE (25 µL).
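The amounts quoted for the oxidation and coupling steps can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the stated volumes and concentrations. The helper name `nmol` and the variable names are illustrative and do not come from the original protocol.

```python
def nmol(conc_molar, volume_uL):
    """Amount in nmol from a molar concentration and a volume in µL."""
    # mol/L * µL = µmol; multiply by 1e3 to express the result in nmol.
    return conc_molar * volume_uL * 1e3


# Oxidation step: 0.01 M NaIO4 (5 µL) added to 200 µM RNA (5 µL) in 15 µL buffer.
periodate_nmol = nmol(0.01, 5)
rna_nmol = nmol(200e-6, 5)
oxidation_volume_uL = 5 + 5 + 15
periodate_equiv = periodate_nmol / rna_nmol

# Quench: 2.5 µL of 10 M glycerol relative to the periodate added.
glycerol_excess = nmol(10, 2.5) / periodate_nmol

# Coupling step: 88 µM peptide hydrazide against 40 µM oxidized RNA.
peptide_equiv = 88 / 40

print(f"periodate: {periodate_nmol:.0f} nmol, RNA: {rna_nmol:.0f} nmol "
      f"({periodate_equiv:.0f} equiv) in {oxidation_volume_uL} µL")
print(f"glycerol excess over periodate: {glycerol_excess:.0f}-fold")
print(f"peptide excess over RNA: {peptide_equiv:.1f}-fold")
```

The numbers reproduce the 50 nmol, 1 nmol, 50 equiv, and 25 µL values given in the text, and show that the peptide is used at only a 2.2-fold excess over the RNA in the coupling step.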

#### *2.3. Evaluation of the Reaction Yields of Covalently-Linked RNPs*

Denaturing polyacrylamide gel electrophoresis (PAGE) (8 M urea, 12%) was performed to separate the unreacted RNA and covalently linked RNP by loading the same volume of the purified samples. A total of 10 pmol of RNA was loaded on the gel as the control of RNA band mobility. The above stock solution of phenol/chloroform extracted RNP (25 µL) was further diluted with TE to 200 µL. From this RNP solution, 2 µL was analyzed by PAGE for each reaction. The acrylamide gel was stained with ethidium bromide to detect RNA and RNP. The yield of each reaction was evaluated from the ratio of intensity of bands corresponding to unreacted RNA and RNP. Actual isolation yields of RNP ranged from 30 to 40% after the PAGE purification and successive purification by ethanol precipitation.
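The yield evaluation from band intensities and the amount of RNA loaded per lane can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis script: it assumes that ethidium bromide stains free RNA and c-RNP with equal efficiency, that the RNA is fully recovered after extraction and precipitation, and the example intensities are hypothetical values, not measured data.

```python
def conjugation_yield(rnp_intensity, rna_intensity):
    """Percent of RNA converted to c-RNP, from the two band intensities."""
    total = rnp_intensity + rna_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * rnp_intensity / total


def loaded_pmol(conc_uM, reaction_uL, diluted_uL, load_uL):
    """RNA loaded per lane in pmol, assuming complete recovery."""
    total_pmol = conc_uM * reaction_uL  # µM * µL = pmol
    return total_pmol * load_uL / diluted_uL


# Hypothetical intensities (arbitrary units) for one lane.
print(f"yield: {conjugation_yield(9200, 800):.0f}%")

# 40 µM RNA in a 25 µL reaction, diluted to 200 µL, 2 µL loaded.
print(f"loaded: {loaded_pmol(40, 25, 200, 2):.0f} pmol")
```

Under the full-recovery assumption, each sample lane carries 10 pmol of RNA, which matches the 10 pmol of RNA loaded as the mobility control.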

#### *2.4. MALDI-TOF Mass Analysis of the Reaction Solution*

The reaction solution, from which the unreacted peptide had been removed by phenol/chloroform extraction and ethanol precipitation, was characterized by MALDI-TOF mass spectrometry (AXIMA-Confidence, Shimadzu, Kyoto, Japan). A matrix solution was prepared as a mixture (*v*/*v*: 9/1) of 10 mg/mL of THAP in acetonitrile/H2O (*v*/*v*: 1/1) and 50 mg/mL of DAHC in pure water. The sample solution was mixed with the same volume of matrix solution (1 µL each). DNA oligonucleotides (molecular weight: 5828.9, 11,397.5, 13,976.2, 23,164.3) were used as the external calibration standards. All measurements were performed in linear positive mode.


#### **3. Results**

#### *3.1. Effect of Solution pH for the Formation of Covalent Linkage*

The effect of solution pH on the formation of the covalent linkage between the 3′-dialdehyde group of the oxidized RNA subunit and the hydrazide group at the C-terminus of the peptide subunit of the Rev-RRE complex was investigated by carrying out the reaction at various pH levels. The RNA subunit of the ATP-binding RNP receptor, An16, was oxidized with sodium periodate in a sodium acetate buffer at pH 5. The reaction was performed at 37 °C for 1 h. After isolation of the oxidized RNA by ethanol precipitation, acetylated Rev peptide modified with a C-terminal hydrazide group through a flexible linker (GGS)3G (Ac-Rev-(GGS)3G-HZ) was mixed with the oxidized RNA in sodium acetate buffer at pH 4 or 5 or in sodium phosphate buffer at pH 6 or 7. The reaction mixtures were incubated for 3 h at 37 °C, and the reactions were then stopped by the addition of phenol/chloroform followed by vigorous mixing. The aqueous layer was concentrated by ethanol precipitation. The reaction yield of c-RNP in each solution was evaluated by quantifying the band intensities corresponding to free RNA and conjugated c-RNP in denaturing polyacrylamide gel electrophoresis (PAGE) (Figure 2). More than 80% of the RNA reacted with the peptide under all conditions. The reaction yields were 92 ± 4% at pH 4, 92 ± 5% at pH 5, 86 ± 3% at pH 6, and 82 ± 3% at pH 7 (Figure S1A). These results indicated that solution pH in the range of 4 to 7 had little effect on the yield of the RNA-peptide conjugate. Prolonged incubation for over 15 h at pH 5 did not change the ratio of the band intensities for c-RNP and unreacted RNA (data not shown).

**Figure 2.** A denaturing polyacrylamide gel electrophoresis (PAGE) analysis (8 M Urea) of the conjugation reaction products of c-RNPs at different pH conditions. The yield was calculated from the ratio of intensities of the bands corresponding to c-RNP and RNA. Lane 1: An16 RNA only; Lane 2: An16 and Ac-Rev-(GGS)3G-HZ reacted at pH 4; Lane 3: An16 and Ac-Rev-(GGS)3G-HZ reacted at pH 5; Lane 4: An16 and Ac-Rev-(GGS)3G-HZ reacted at pH 6; Lane 5: An16 and Ac-Rev-(GGS)3G-HZ reacted at pH 7.

#### *3.2. Contribution of the RNA-Peptide Interaction for the Reaction Yield*

The RNA sequence dependency of the reaction was next evaluated to address the proximity effect of the reactive groups. The specific interaction between the Rev peptide and RRE RNA of the RNP scaffold was expected to influence the efficiency of covalent linkage formation. A sequence-scrambled RNA, scrAn16, which contained the same number of each nucleotide as An16 RNA, was prepared (Figure 3A). The reaction of the oxidized scrAn16 with Ac-Rev-(GGS)3G-HZ was performed at pH 5. MALDI-TOF MS analysis of the reaction solution after removal of the unreacted peptide was performed to characterize the products. The production of covalently linked RNP was confirmed by observing the respective MS peaks (Figures S2 and S3). In five trials, scrAn16 showed an average reaction yield (92 ± 5%) similar to that of the parent An16 (92 ± 5%) when conjugated with the hydrazide group of Ac-Rev-(GGS)3G-HZ (Figure 3B and Figure S1B,D). This result showed that the RNA sequence did not affect the efficiency of the reaction, suggesting that the non-specific electrostatic interaction between the RNA and the cationic Rev peptide was instead responsible for the reaction efficiency. To verify this notion, the reaction for the formation of covalently linked RNP was carried out by using a truncated version of Ac-Rev-(GGS)3G-HZ. A truncated peptide possessing only the GGS peptide linker moiety, Ac-GGSGGSGGSG-HZ, was designed by deleting the Rev peptide sequence from Ac-Rev-(GGS)3G-HZ. The truncated peptide showed a significant decrease in the yield to 57 ± 1%, even after a prolonged reaction time of 14 h (Figure 4 and Figure S1C). This result supported the notion that the electrostatic interaction between the RNA and the peptide was the main contributor to the high reaction yield. Thus, the RNP scaffold was useful for the efficient production of RNA-peptide conjugates.

**Figure 3.** (**A**) Nucleotide and peptide sequence of An16 RNA, scrAn16 RNA, Ac-Rev-(GGS)3G-HZ peptide, and Ac-(GGS)3G-HZ peptide; (**B**) a denaturing PAGE (8 M Urea) analysis of the conjugation reaction products of c-RNPs constructed from An16 or scrAn16 RNA with Ac-Rev-(GGS)3G-HZ. Lane 1: An16 RNA; Lane 2: An16 and Ac-Rev-(GGS)3G-HZ reacted for 3 h; Lane 3: scrAn16 RNA; Lane 4: scrAn16 and Ac-Rev-(GGS)3G-HZ reacted for 3 h; Lane 5: scrAn16 and Ac-Rev-(GGS)3G-HZ reacted for 14 h.

**Figure 4.** A denaturing PAGE (8 M Urea) analysis of the conjugation reaction products of c-RNP constructed from An16 and Ac-(GGS)3G-HZ. Lane 1: An16 RNA; Lane 2: c-An16/Ac-Rev-(GGS)3G-HZ reacted for 3 h; Lane 3: c-An16/Ac-(GGS)3G-HZ reacted for 3 h; Lane 4: c-An16/Ac-(GGS)3G-HZ reacted for 14 h.

#### **4. Discussion**

Schiff base formation between the hydrazide and aldehyde groups progressed rapidly under mildly acidic conditions, although the product was less stable than at neutral pH. Because the efficiency of this crosslinking method depends on the balance of these two properties, the assays over a wide pH range helped us to understand the limitations and applicability of this reaction under physiological conditions for RNA-protein or RNA-peptide conjugation. Formation of the covalent linkage between the RNA and the Rev peptide subunit in the Rev-RRE complex proceeded quantitatively under mildly acidic to neutral pH conditions within 3 h. This result indicated the versatility of the reaction under physiological conditions for crosslink formation between peptide and RNA. Unlike the sequence-specific formation of the stable noncovalent RNP complex, formation of the covalent linkage between the 3′-dialdehyde group of RNA and the hydrazide group at the C-terminus of the Rev peptide did not require a specific RNA sequence for binding of the Rev peptide. Even nonspecific RNA-peptide complex formation driven by the electrostatic interaction was sufficient to exert the proximity effect of the reactive groups, thereby resulting in efficient formation of the covalent linkage between the Rev peptide and RNA. The significant decrease in reaction efficiency upon deletion of the cationic moiety from the peptide sequence also showed the contribution of the electrostatic interaction to the efficient progress of the crosslinking reaction.

We designed a DNA sequence-specific protein tag that underwent proximity-driven intermolecular crosslinking between protein and DNA [43–49]. Selective DNA modification by a self-ligating protein tag conjugated with a DNA-binding domain, termed a modular adaptor, was achieved by relying on the chemoselectivity of the protein tag [50–52]. By tuning the alkylation kinetics of the protein tag and its substrate, the sequence-specific crosslinking reaction of the modular adaptor was exclusively driven by DNA recognition when the dissociation rate of the DNA complex was much larger than the rate constant for the alkylation reaction [45–49]. In connection with these findings, a sequence-specific crosslinking reaction between the 3′-dialdehyde group of RNA and the hydrazide group of an RNA-binding peptide would be realized by tuning the reaction conditions, such as the concentration of sodium salt. As we have reported previously, the crosslinking reaction between the 3′-dialdehyde group of RNA and the hydrazide group of a peptide was applied to the facile construction of covalently crosslinked fluorescent RNP sensors [41,42]. The covalently crosslinked fluorescent RNP sensor not only improved the stability of RNP sensors but also expanded their applications, such as the simultaneous detection of multiple targets in solution. The noncovalent RNP scaffold was effective for the library-based selection and the cooperative functionalization of RNP receptors, such as fluorescent sensors and catalysts [53]. Once a functional noncovalent RNP was obtained, formation of a covalent linkage within the RNA-peptide complex provided a stable RNP for a wide range of applications. Only a natural RNA oligonucleotide and a readily prepared reactive peptide were needed to perform this reaction. Our investigations of the effects of pH, RNA sequence, and peptide sequence on the efficiency of this reaction would help us to understand the limitations and applicability of this method for the construction of RNA-peptide or RNA-protein conjugates. They would also be helpful for the development of emerging applications of oligonucleotide-peptide conjugates, such as self-assembled scaffolds for nanostructures or efficient delivery systems for functional RNA into the cell.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/24/8920/s1, Figure S1: The results of band intensity analysis of denaturing PAGE, Figure S2: MALDI TOF MS analysis of the reaction solution, Figure S3: Summary of the results of MALDI TOF MS analysis.

**Author Contributions:** S.N. and T.M. conceived and designed the experiments; S.N., T.S. and Z.Z. performed the experiments; T.M. supervised the project. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by JSPS KAKENHI Grant Numbers 18K14335 (S.N.) and 17H01213 (T.M.), Japan and by JST CREST Grant Number JPMJCR18H5 (T.M.), Japan.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Design of the Crosslinking Reactions for Nucleic Acids-Binding Protein and Evaluation of the Reactivity**

#### **Kenta Odaira, Ken Yamada, Shogo Ishiyama, Hidenori Okamura and Fumi Nagatsugi \***

Institute of Multidisciplinary Research for Advanced Materials, Tohoku University, Sendai 980-8577, Japan; k.odaira@kobayashi.co.jp (K.O.); Ken.Yamada@umassmed.edu (K.Y.); rare.earth.144@gmail.com (S.I.); hidenori.okamura.b8@tohoku.ac.jp (H.O.)

**\*** Correspondence: nagatugi@tohoku.ac.jp; Tel.: +81-22-217-5633

Received: 2 October 2020; Accepted: 29 October 2020; Published: 30 October 2020

#### **Featured Application: Alkylation for nucleic acids-binding protein.**

**Abstract:** Selective chemical reactions of biomolecules are some of the important tools for investigations by biological studies. We have developed the selective crosslinking reactions to form covalent bonds to DNA or RNA using crosslinking oligonucleotides (CFO) bearing reactive bases. In this study, we designed the cross-linkable 4-amino-6-oxo-2-vinyltriazine derivative with an acyclic linker (*acy*AOVT) to react with the nucleic acids-binding protein based on our previous results. We hypothesized that the *acy*AOVT base would form a stable base pair with guanine by three hydrogen bonds at the positions of the vinyl group in the duplex DNA major groove, and the vinyl group can react with the nucleophilic species in the proximity, for example, the cysteine or lysine residue in the nucleic acids-binding protein. The synthesized oligonucleotides bearing the *acy*AOVT derivative showed a higher reactivity than that of the corresponding pyrimidine derivative without one nitrogen. The duplex containing *acy*AOVT-guanine (G) formed complexes with Hha1 DNMT even in the presence of 2-mercaptoethanol. We expect that our system will provide a useful tool for the molecular study of nucleic acids-binding proteins.

**Keywords:** oligonucleotide; crosslink; nucleic acid binding protein; cytosine methyltransferase

#### **1. Introduction**

The binding of proteins to nucleic acids (DNA and RNA) is central to all aspects of gene expression regulation. DNA-binding proteins include the transcription factor to bind to specific DNA sequences and control the transcription rate of genetic information from the DNA to RNA [1,2]. Many DNA modifying enzymes are also included in the DNA binding protein, for example, repair enzymes [3,4] and epigenetic modification enzymes. DNA methyl transferase is one of the epigenetic modification enzymes and catalyses the transfer of the methyl group to DNA modifying the function of genes and affecting gene expression [5,6]. RNA-binding proteins have important functions in the post-transcriptional process, such as splicing regulation [7] and modulation of the mRNA translation. Recent studies have revealed that post-transcriptional gene regulation by non-coding RNAs is involved in most biological activities. The association of the RNA binding protein with ncRNA plays a crucial role in these biological functions [8–10]. The chemical tools for the control of the interaction between the nucleic acids-binding protein and nucleic acids have the potential for the development as the new strategy for artificial control of the gene expression.

Crosslinking reactions between nucleic acids and binding proteins are powerful tools for analyzing these interactions. Photoreactive functional groups, such as diazirines and benzophenones, have been utilized for DNA- or RNA-protein crosslinking [11–14], but they lack selectivity for the target amino acid. Furthermore, due to their intrinsic reactivity, these groups react with water and other reactive chemical species, resulting in low yields. Oligodeoxynucleotides (ODN) with reactive groups are exploited for effective DNA-protein crosslinking reactions through the proximity effect arising from specific DNA recognition by the binding protein. Reactive groups activated by oxidation, such as a diol [15] and furan [16], react with a proximal lysine or arginine in the protein or peptide. Hocek et al. reported that ODN bearing a vinylsulfone amide [17] or chloroacetamide group [18] efficiently cross-linked with the p53 protein through alkylation of a cysteine proximal to the reactive groups. Recently, they demonstrated that ODN with 2-vinylhypoxanthine reacted with a thiol-containing minor groove binding peptide by the proximity effect [19].

In our previous study, we reported the synthesis of the ODN bearing 4-amino-6-oxo-2-vinyltriazine with an ethyl linker (Et-AOVT: (**1**)) as a cross-linkable derivative and an evaluation of its properties [20]. This ODN showed a relatively high reactivity with the pyrimidine nucleobases of the complementary DNAs and the lowest reactivity to guanine. We hypothesized that the AOVT base would form a stable base pair with guanine (G) by three hydrogen bonds, thereby positioning the vinyl group in the DNA major groove. In this study, we have designed an AOVT derivative with an acyclic linker (*acy*AOVT: (**2**)) as a cross-linkable probe of the nucleic acids-binding protein. The distance between the reactive base and sugar moiety of **2** is shorter than that of **1**, and consequently, the *acy*AOVT (**2**) should form a base pair with guanine (G) similar to the natural cytosine (C)-G pair. This assumption is supported by the molecular modeling shown in Figure 1. The molecular modeling revealed that the structures of the *acy*AOVT (**2**)-G base pair and the natural C-G pair significantly overlap each other (Figure 1C). On the other hand, the structure of Et-AOVT (**1**)-G is subtly shifted from that of the natural base pair (Figure 1B). The duplex DNA containing the *acy*AOVT (**2**)-G pair is expected to bind to nucleic acid binding proteins similarly to the natural duplex and to form covalent linkages with the nucleophilic residues of the protein proximal to the vinyl group (Figure 2).

**Figure 1.** (**A**) Structures of cross-linkable derivatives; (**B**,**C**) Superimposed structures between the natural duplex (pink) and the duplex contained **1** (**B**) or **2** (**C**) (green).

**Figure 2.** Design of the cross-linkable probe to nucleic acids-binding protein.

#### **2. Materials and Methods**

#### *2.1. General*

The <sup>1</sup>H-NMR spectra were obtained on 400 or 600 MHz spectrometers (Bruker, Billerica, MA, USA). The <sup>1</sup>H chemical shifts are described as δ values in ppm relative to acetone-d6 (2.05 ppm), DMSO-d6 (2.50 ppm), CDCl3 (7.26 ppm), and tetramethylsilane (0.00 ppm). The <sup>13</sup>C-NMR spectra were obtained on a Bruker 600 MHz spectrometer. The <sup>13</sup>C chemical shifts are described as δ values in ppm relative to acetone-d6 (29.84 ppm), DMSO-d6 (39.52 ppm), and CDCl3 (77.16 ppm). The <sup>31</sup>P-NMR spectra were recorded on a Bruker 500 MHz spectrometer (202 MHz for <sup>31</sup>P). The multiplicity and qualifier abbreviations are as follows: s = singlet, d = doublet, t = triplet, q = quartet, quin. = quintet, sept. = septet, m = multiplet, br = broad. The electrospray ionization (ESI) mass spectra were recorded using a BioTOF II mass spectrometer or APEX III (Bruker Daltonics, Bruker, Billerica, MA, USA). The matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectra were recorded on an Autoflex speed mass spectrometer with a laser at 337 nm in the negative mode using 3-hydroxypicolinic acid as the matrix or in the positive mode using 2,5-dihydroxybenzoic acid as the matrix. Thin-layer chromatography (TLC) was performed using silica gel 60 F254 pre-coated plates. Column chromatography was performed using silica gel 60 N (spherical, neutral, 100–210 µm, Kanto Chemical, Tokyo, Japan). Flash chromatography was performed using Kanto Chemical silica gel 60 N (spherical, neutral, 40–50 µm). The ultraviolet-visible (UV-vis) absorption spectra were recorded using a DU800 spectrometer (Beckman Coulter, Brea, CA, USA). The ODN synthesis was carried out using an automated DNA synthesizer (392 DNA/RNA synthesizer, ABI, Foster City, CA, USA) following the standard phosphoramidite chemistry. High performance liquid chromatography (HPLC) was performed using a Cosmosil 5C18MSII column (4.6 or 10 × 250 mm, Nacalai Tesque, Kyoto, Japan), a PU-986 pump (JASCO, Tokyo, Japan), a JASCO 2075 UV detector, and a JASCO 2067 column oven. The pH was measured with a Seven Easy pH meter (Mettler Toledo, Columbus, OH, USA) using an 8220BNWP electrode. The denaturing polyacrylamide gel plates were visualized and quantified using a FLA-5100 Fluor Imager (Fujifilm, Tokyo, Japan). Anhydrous methanol, DMF, THF, CH2Cl2, dioxane, pyridine, DMSO, CH3CN, and toluene were purchased from Wako Pure Chemical Industries, Ltd. (Tokyo, Japan). Unless otherwise noted, all the synthetic reactions were carried out at ambient temperature. The reactions requiring anhydrous conditions were carried out under an argon atmosphere in flasks dried under 1–3 mmHg. Commercially available reagents were obtained from Wako Pure Chemical Industries, Ltd., TCI, Inc. (Tokyo, Japan), Sigma-Aldrich Co. LLC (St. Louis, MO, USA), and Kanto Chemical Co., Inc. (Tokyo, Japan) and used without further purification. Human DNA (cytosine-5) methyltransferase (DNMT1) was obtained from New England Bio Labs Japan, Inc. (Tokyo, Japan).

#### *2.2. Synthesis of the Nucleoside Derivatives*

#### 2.2.1. Synthesis of 4-amino-6-oxo-1-(3′,5′-*O*-t-butyldimethylsilyl-2′,4′-dideoxy-d-ribityl) triazine (**4**)

The linker **3** (1.3 g, 3.0 mmol), which was prepared by a modified literature procedure [21], was added to 5-azacytosine Na salt (1.0 g, 7.5 mmol) in DMSO (30 mL). The mixture was stirred at 60 °C for 18 h. The resulting mixture was diluted with EtOAc and quenched with NH4Cl. The product was extracted with EtOAc and the combined organic layer was washed with brine, dried over Na2SO4, filtered, and evaporated under reduced pressure. The resulting oil was purified by silica gel column chromatography (CH3OH/CHCl3, 1:100, then CH3OH/CHCl3, 1:50) to afford the corresponding N1-alkylation (772 mg, 1.74 mmol, 58%) and N3-alkylation (126 mg, 0.29 mmol, 10%) products.

N1-alkylation product: <sup>1</sup>H-NMR (400 MHz, CDCl3) δ 7.90 (s, 1H), 6.98 (brs, 1H), 5.65 (brs, 1H), 3.97 (ddd, 2H, *J* = 12.0, 12.0, 6.0 Hz), 3.90–3.77 (m, 2H), 3.64 (dd, 2H, *J* = 6.4, 6.4 Hz), 1.99–1.60 (m, 6H), 1.37–1.23 (m, 1H), 0.89 (s, 9H), 0.87 (s, 9H), 0.07 (s, 3H), 0.06 (s, 3H), 0.03 (s, 6H); <sup>13</sup>C-NMR (125 MHz, CDCl3) δ 166.7, 158.7, 154.5, 67.4, 59.6, 45.1, 40.2, 36.2, 26.12, 26.09, 18.4, 18.3, −4.2, −4.3, −5.14, −5.16; HRMS (ESI) calcd. for C20H43N4O3Si2<sup>+</sup> [M + H]<sup>+</sup> *m*/*z* 443.2868, found *m*/*z* 443.2879.

N3-alkylation product: <sup>1</sup>H-NMR (600 MHz, CDCl3) δ 8.34 (s, 1H), 5.90 (brs, 1H), 5.61 (brs, 1H), 4.42–4.35 (m, 1H), 4.05–4.01 (m, 1H), 3.70–3.64 (m, 2H), 1.99–1.95 (m, 1H), 1.89–1.84 (m, 2H), 1.7–1.67 (m, 2H), 0.871 (s, 9H), 0.867 (s, 9H), 0.046 (s, 3H), 0.027 (s, 3H), 0.025 (s, 3H), 0.018 (s, 3H); <sup>13</sup>C-NMR (125 MHz, CDCl3) δ 170.4, 167.93, 167.88, 66.3, 64.5, 59.6, 40.3, 35.9, 25.9, 25.8, 18.2, 18.0, −4.6, −4.8, −5.36, −5.38; HRMS (ESI) calcd. for C20H43N4O3Si2<sup>+</sup> [M + H]<sup>+</sup> *m*/*z* 443.2868, found *m*/*z* 443.2879.
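The calculated HRMS value for the N1- and N3-alkylation products can be reproduced from the ion formula by summing monoisotopic masses and subtracting one electron mass for the positive charge. The sketch below is illustrative only: the function names are hypothetical, the isotope masses are standard reference values rounded to seven decimals, and the simple parser handles only flat formulas without brackets.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, rounded.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146,
    "Si": 27.9769265, "S": 31.9720707, "P": 30.9737620,
}
ELECTRON = 0.00054858  # electron mass in u


def parse_formula(formula):
    """Count atoms in a flat formula such as 'C20H43N4O3Si2'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def cation_mz(ion_formula):
    """m/z of a singly charged cation; the formula already includes the proton."""
    mass = sum(MONOISOTOPIC[e] * n for e, n in parse_formula(ion_formula).items())
    return mass - ELECTRON


def ppm_error(found, calculated):
    """Mass accuracy in parts per million."""
    return (found - calculated) / calculated * 1e6


calc = cation_mz("C20H43N4O3Si2")
print(f"calcd m/z {calc:.4f}, error vs. found 443.2879: "
      f"{ppm_error(443.2879, calc):.1f} ppm")
```

For the ion formula C20H43N4O3Si2 this gives a value that agrees with the reported calcd. *m*/*z* of 443.2868 to four decimals, and the found value lies within a few ppm of it.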

#### 2.2.2. Synthesis of 4-amino-2-(2-octylthioethyl)-6-oxo-1-(3′,5′-*O*-t-butyldimethylsilyl-2′,4′-dideoxy-d-ribityl) triazine (**6**)

A 28% aq NH4OH (10 mL) solution was added to a solution of the N1-alkylation product **4** (172 mg, 390 µmol) in dioxane-methanol (10 mL, 1/1, *v*/*v*), and the flask was sealed with a glass stopper. The mixture was stirred at 55 °C for 24 h. The resulting solution was evaporated under vacuum at 43 °C. The resulting moist solid was co-evaporated with ethanol three times to afford guanylurea **5**. This product was directly used for the next step without further purification. The orthoester (597 mg, 1.95 mmol, purity = 91%) was added to a solution of guanylurea (169 mg, 390 µmol) in DMF (0.2 M, 1.95 mL). The mixture was stirred at 120 °C for 2 h. The resulting yellow solution was diluted with Et2O. The product was extracted with Et2O and the combined organic layer was washed with NH4Cl aq., NaHCO3 aq., and brine, dried over Na2SO4, filtered and evaporated. The crude product was purified by silica gel column chromatography (CHCl3, then CHCl3 with 1% Et3N, then CH3OH/CHCl3, 1:300 with 1% Et3N) to afford **6** (198.3 µmol, 51%) as a yellow oil: <sup>1</sup>H-NMR (600 MHz, CDCl3) δ 5.70 (brs, 1H), 5.17 (brs, 1H), 4.02–3.89 (m, 3H), 3.65 (dd, 2H, *J* = 6.6, 6.6 Hz), 2.95–2.88 (m, 3H), 2.54 (t, 2H, *J* = 7.8 Hz), 1.93–1.88 (m, 1H), 1.79–1.64 (m, 4H), 1.59 (quin, 2H, *J* = 7.2 Hz), 1.39–1.25 (m, 10H), 0.91 (s, 9H), 0.89 (t, 3H, *J* = 7.2 Hz), 0.88 (s, 9H), 0.089 (s, 3H), 0.086 (s, 3H), 0.04 (s, 6H); <sup>13</sup>C-NMR (150 MHz, CDCl3) δ 168.8, 165.3, 155.8, 67.6, 59.6, 41.6, 40.4, 35.8, 34.3, 33.0, 32.1, 29.9, 29.5, 29.2, 28.2, 26.19, 26.18, 22.9, 18.5, 18.3, 14.4, 0.26, −4.18, −4.26, −4.30, −5.1; HRMS (ESI) calcd. for C30H63N4O3SSi2<sup>+</sup> [M + H]<sup>+</sup> *m*/*z* 615.4154, found *m*/*z* 615.4158.

#### 2.2.3. Synthesis of 2-(2-octylthioethyl)-6-oxo-4-acetylamino-1-(3′,5′-*O*-t-butyldimethylsilyl-2′,4′-dideoxy-d-ribityl) triazine (**7**)

AcCl (5.3 µL, 74.7 µmol) was added to a solution of **6** (15 mg, 24.9 µmol) in pyridine (250 µL). The mixture was stirred at rt for 19 h. The resulting solution was diluted with Et2O and quenched with NH4Cl. The product was extracted with Et2O and the combined organic layer was washed with brine, dried over Na2SO4, filtered and evaporated. The crude product was purified by silica gel column chromatography (CHCl3, then CHCl3 with 1% Et3N) to afford **7** (13.4 µmol, 54%) as a colorless oil: <sup>1</sup>H-NMR (600 MHz, CDCl3) δ 7.71 (br, 1H), 4.08–3.98 (m, 3H), 3.66 (t, 2H, *J* = 6.0 Hz), 3.65 (t, 2H, *J* = 6.0 Hz), 3.00 (t, 2H, *J* = 6.6 Hz), 2.91 (t, 2H, *J* = 6.6 Hz), 2.61 (s, 3H), 2.54 (t, 2H, *J* = 7.2 Hz), 1.95 (m, 1H), 1.81–1.64 (m, 3H), 1.58 (qt, 2H, *J* = 7.2, 7.2 Hz), 1.39–1.26 (m, 10H), 0.92 (s, 9H), 0.91 (t, 3H, *J* = 7.2 Hz), 0.88 (s, 9H), 0.095 (s, 3H), 0.093 (s, 3H), 0.041 (s, 3H), 0.03 (s, 3H); <sup>13</sup>C-NMR (150 MHz, CDCl3) δ 172.0, 170.2, 161.5, 154.8, 41.9, 40.0, 34.9, 34.1, 32.8, 29.7, 29.5, 29.2, 28.8, 27.7, 25.87, 25.86, 25.85, 22.6, 18.2, 18.0, 14.1, −4.52, −4.55, −5.38, −5.41; HRMS (ESI) calcd. for C32H64N4O4SSi2<sup>+</sup> [M + H]<sup>+</sup> *m*/*z* 657.4246, found *m*/*z* 657.4246.

#### 2.2.4. Synthesis of 2-(2-octylthioethyl)-6-oxo-4-acetylamino-1-(2′,4′-dideoxy-d-ribityl) triazine (**8**)

Boron trifluoride-diethylether complex (51 µL, 398 µmol) was slowly added to a solution of **7** (131 mg, 199 µmol) in CH3CN (800 µL) at 0 °C. After the addition, the mixture was stirred at 0 °C for 2 h. The resulting solution was diluted with EtOAc, then quenched with phosphate buffer (pH = 6.0, 1 M). The product was extracted with EtOAc and the combined organic layer was washed with brine, dried over Na2SO4, filtered and evaporated. The crude product was purified by silica gel chromatography, eluting with CH3OH/CHCl3, 1:50, then 1:10 to give **8** (103 µmol, 52%) as a white solid: <sup>1</sup>H-NMR (600 MHz, CDCl3) δ 7.80 (br, 1H), 4.38–4.33 (m, 1H), 4.00–3.96 (m, 1H), 3.91–3.88 (m, 1H), 3.85–3.81 (m, 2H), 3.15 (dt, 1H, *J* = 16.8, 7.2 Hz), 3.05 (dt, 1H, *J* = 16.8, 7.2 Hz), 2.93 (t, 2H, *J* = 7.2 Hz), 2.61 (s, 3H), 2.56 (t, 2H, *J* = 7.2 Hz), 1.95–1.90 (m, 1H), 1.82–1.71 (m, 3H), 1.59 (qt, 2H, *J* = 7.2, 7.2 Hz), 1.38–1.25 (m, 10H), 0.88 (t, 3H, *J* = 7.2 Hz); <sup>13</sup>C-NMR (150 MHz, CDCl3) δ 171.9, 170.9, 161.7, 156.0, 68.1, 61.63, 61.59, 41.7, 38.0, 36.2, 34.1, 32.8, 31.8, 29.6, 29.2, 28.8, 27.9, 25.9, 22.6, 14.1; HRMS (ESI) calcd. for C20H36N4O4S<sup>+</sup> [M + H]<sup>+</sup> *m*/*z* 429.2528, found *m*/*z* 429.2530.

#### 2.2.5. Synthesis of 2-(2-octylthioethyl)-6-oxo-4-acetylamino-1-(5′-*O*-(4,4′-dimethoxytrityl)-2′,4′-dideoxy-d-ribityl) triazine (**9**)

DMTrCl (58 mg, 170 µmol) was added to a solution of **8** (497 mg, 114 µmol) in pyridine (487 µL) at 0 °C. After the addition, the mixture was stirred at 0 °C for 1 h. The resulting mixture was diluted with CH2Cl2, then quenched with NaHCO3. The product was extracted with CH2Cl2 and the combined organic layer was washed with brine, dried over Na2SO4, filtered and evaporated. The crude product was purified by chromatography on silica gel, eluting with CHCl3 and 1% Et3N, then CH3OH/CHCl3, 1:200 with 1% Et3N to give **9** (80 µmol, 70%) as a colorless oil: <sup>1</sup>H-NMR (600 MHz, CDCl3) δ 7.72 (br, 1H), 7.39–7.38 (m, 2H), 7.30–7.27 (m, 6H), 7.21–7.19 (m, 1H), 6.83–6.80 (m, 2H), 4.20 (dt, 1H, *J* = 13.2, 7.2 Hz), 4.04 (dt, 1H, *J* = 13.2, 7.2 Hz), 3.15 (dt, 1H, *J* = 16.8, 7.2 Hz), 3.06 (dt, 1H, *J* = 16.8, 7.2 Hz), 2.96 (t, 2H, *J* = 7.2 Hz), 2.60 (s, 3H), 2.54 (t, 2H, *J* = 7.2 Hz), 1.96–1.91 (m, 1H), 1.83–1.77 (m, 1H), 1.69–1.66 (m, 2H), 1.57 (qt, 2H, *J* = 7.2, 7.2 Hz), 1.37–1.26 (m, 10H), 0.87 (t, 3H, *J* = 7.2 Hz); <sup>13</sup>C-NMR (150 MHz, CDCl3) δ 172.0, 170.8, 161.6, 158.5, 155.3, 144.5, 135.7, 135.6, 129.88, 129.86, 128.0, 127.9, 126.9, 113.2, 86.9, 68.4, 62.4, 55.2, 42.1, 36.5, 35.5, 34.1, 32.6, 31.8, 29.5, 29.19, 29.17, 28.8, 25.9, 22.6, 14.1; HRMS (ESI) calcd. for C41H54N4O6S<sup>+</sup> [M + H]<sup>+</sup> *m*/*z* 731.3837, found *m*/*z* 731.3834.

2.2.6. Synthesis of 2-(2-octylthioethyl)-6-oxo-4-acetylamino-1-(3′-*N*,*N*-diisopropyl cyanoethyl-phosphoramidyl-5′-*O*-(4,4′-dimethoxytrityl)-2′,4′-dideoxy-d-ribityl) triazine (**10**)

DIPEA (83.2 µL, 14.3 µmol) was added to a solution of **9** (15.2 mg, 20.8 µmol) in CH2Cl2 (416 µL) and the mixture was cooled to 0 °C. NCCH2CH2OP[N(*i*-Pr)2]Cl (9.3 µL, 41.6 µmol) was then added to the mixture. After the addition, the mixture was stirred at 0 °C for 1 h, diluted with CH2Cl2, then quenched with NaHCO3. The product was extracted with CH2Cl2 and the combined organic layer was washed with brine, dried over Na2SO4, filtered and evaporated. The crude product was co-evaporated with toluene, then purified by chromatography on silica gel, eluting with EtOAc/hexane, 1:1 with 1% Et3N, then 2:3 with 1% Et3N to give **10** (15.0 µmol, 72%) as a colorless oil (mixture of two diastereomers): <sup>1</sup>H-NMR (600 MHz, CDCl3) δ 7.65 (br, 1H), 7.41 (d, 2H, *J* = 7.2 Hz), 7.31–7.27 (m, 6H), 7.21–7.18 (m, 1H), 6.83–6.81 (m, 4H), 4.22–4.04 (m, 3H), 3.82–3.49 (m, 4H), 3.79 (s, 6H), 3.22–3.15 (m, 2H), 3.10–2.95 (m, 2H), 2.91–2.86 (m, 2H), 2.64–2.61 (m, 1H), 2.62–2.61 (m, 3H), 2.55–2.50 (m, 2H), 2.48–2.46 (m, 1H), 2.01–1.77 (m, 4H), 1.59–1.54 (m, 2H), 1.39–1.25 (m, 10H), 1.16–1.03 (m, 12H), 0.88 (t, 3H, *J* = 7.2 Hz); <sup>13</sup>C-NMR (150 MHz, CDCl3) δ 172.04, 172.00, 170.4, 170.2, 161.44, 161.40, 158.3, 154.9, 154.7, 145.10, 145.05, 136.32, 136.30, 136.29, 136.27, 130.0, 128.1, 128.0, 127.7, 126.7, 126.6, 118.0, 117.6, 113.0, 86.0, 77.2, 77.0, 76.8, 70.4, 70.3, 70.0, 69.9, 60.4, 60.0, 58.0, 57.8, 57.4, 57.3, 55.18, 55.15, 43.08, 43.05, 43.00, 42.97, 42.0, 41.9, 36.82, 36.80, 36.25, 36.23, 34.4, 34.3, 34.2, 34.04, 33.95, 32.82, 32.79, 31.8, 29.53, 29.52, 29.18, 29.16, 28.84, 28.83, 27.7, 25.9, 24.80, 24.75, 24.60, 24.56, 24.52, 24.50, 22.6, 21.1, 20.50, 20.45; <sup>31</sup>P-NMR (202 MHz, CDCl3) δ 147.70, 146.72; HRMS (ESI) calcd. for C50H71N6O7PS<sup>+</sup> [M + H]<sup>+</sup> *m*/*z* 931.4915, found *m*/*z* 931.4919.
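The agreement between the calculated and found HRMS values for **8**, **9** and **10** can be expressed as a mass error in parts per million. The following is a minimal sketch of that arithmetic using only the [M + H]<sup>+</sup> values reported above; the function name `ppm_error` is ours and is not part of the original procedure.

```python
def ppm_error(calculated_mz: float, found_mz: float) -> float:
    """Return the signed mass error (found - calculated) in parts per million."""
    return (found_mz - calculated_mz) / calculated_mz * 1e6


# [M + H]+ values (calculated, found) as reported in the text for 8, 9 and 10.
hrms = {
    "8": (429.2528, 429.2530),
    "9": (731.3837, 731.3834),
    "10": (931.4915, 931.4919),
}

for compound, (calcd, found) in hrms.items():
    print(f"compound {compound}: {ppm_error(calcd, found):+.2f} ppm")
```

With these values, all three deviations are below 1 ppm in magnitude.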

#### *2.3. Synthesis of the Oligodeoxynucleotides Containing acyAOVT Derivatives*

**ODN1** and **ODN2** were synthesized on a 1 µmol scale on an ABI 392 DNA/RNA synthesizer using standard β-cyanoethyl chemistry. The 5′-terminal dimethoxytrityl-bearing **ODN1** and **ODN2** were cleaved from the solid support by treatment with 45 mM K2CO3-MeOH containing 10 mM 1-octanethiol (0.5 mL), and the solvent was evaporated under reduced pressure. The crude product was purified by reverse-phase HPLC on a C-18 column (Nacalai Tesque: Cosmosil 5C18-MS-II, 10 × 250 mm) with a linear gradient of 5–40% acetonitrile in 0.1% TEAA buffer over 25 min at a flow rate of 4 mL/min. The dimethoxytrityl group of the purified ODN was removed with 10% AcOH for 30 min and the mixture was further purified by reverse-phase HPLC to afford **ODN1**: MALDI-TOF MS (*m*/*z*) calcd. for [M − H]<sup>−</sup> 4045.8746, found 4045.780; **ODN2**: MALDI-TOF MS (*m*/*z*) calcd. for [M − H]<sup>−</sup> 6960.8115, found 6958.589.

To a solution of **ODN1** or **ODN2** (70 µM) in ddH2O was added a solution of MMPP (1 mM) in carbonate buffer (pH = 10) at room temperature. After 1 h, 5% AcOH was added and the mixture was left for an additional 2 h to give **ODN3**. MALDI-TOF MS: ODN (SOOct): calcd. for [M − H]<sup>−</sup> *m*/*z* 4061.8736, found 4061.755; **ODN5** (vinyl): calcd. for [M − H]<sup>−</sup> *m*/*z* 3899.5826, found 3899.304; **ODN6** (vinyl): calcd. for [M − H]<sup>−</sup> *m*/*z* 6814.5195, found 6811.219.

**ODN7** was prepared by treating 1.0 µM **ODN5** with 100 mM sodium thiomethoxide in 50 mM MES buffer (pH = 7.0) at 37 °C for 19 h. The reaction mixture was purified by reverse-phase HPLC to afford **ODN7**: calcd. for [M − H]<sup>−</sup> *m*/*z* 3947.6856, found 3947.309.

#### *2.4. General Procedure for the Crosslinking Reactions*

The reaction was performed using 5.0 µM **ODN5** and 2.0 µM of the target DNA or RNA, labelled with fluorescein at the 5′-end, in 50 mM MES buffer (pH = 7.0) containing 100 mM NaCl. The reaction mixture was incubated at 37 °C for 1–24 h and quenched by the addition of loading dye (95% formamide, 20 mM EDTA, 0.05% xylene cyanol, and 0.05% bromophenol blue). The cross-linked products were analyzed by denaturing 20% polyacrylamide gel electrophoresis containing urea (7 M) in TBE buffer at 200 V for 1 h. The labelled bands were visualized and quantified using a FLA-5100 Fluor Imager.
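The text does not give the exact formula used to convert band intensities into crosslinking yields. A common convention, which we assume here purely for illustration, is to divide the background-corrected intensity of the cross-linked band by the summed intensity of the cross-linked and unreacted bands in the same lane. The function name and the intensity values below are hypothetical.

```python
def crosslink_yield(crosslinked: float, unreacted: float, background: float = 0.0) -> float:
    """Percent yield from two band intensities in one lane (assumed formula).

    yield = I_crosslinked / (I_crosslinked + I_unreacted) * 100,
    after subtracting the same background value from both bands.
    """
    corrected_crosslinked = max(crosslinked - background, 0.0)
    corrected_unreacted = max(unreacted - background, 0.0)
    total = corrected_crosslinked + corrected_unreacted
    if total == 0.0:
        raise ValueError("no signal above background in this lane")
    return 100.0 * corrected_crosslinked / total


# Hypothetical intensities (arbitrary units) for a single lane.
print(f"{crosslink_yield(crosslinked=350.0, unreacted=750.0, background=50.0):.1f}%")
```

Because only the fluorescein-labelled target strand is detected, this ratio reports the fraction of target converted to the cross-linked product, not the fraction of **ODN5** consumed.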

#### *2.5. Tm Measurement*

All samples for the *T*<sub>m</sub> measurements consisted of 100 mM NaCl, 25 mM MES buffer (pH = 7.0) and 1 µM duplex. The *T*<sub>m</sub> measurements were performed using a temperature controller. Both the heating and cooling curves were measured three times over the temperature range of 25 °C to 80 °C at 0.5 °C/min. The absorbance at 260 nm was recorded every 0.5 °C.
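The method used to extract *T*<sub>m</sub> from the melting curves is not stated. One widely used approach, assumed here only as an illustration, takes *T*<sub>m</sub> as the temperature at which the first derivative of the absorbance with respect to temperature is largest. The sketch below applies this to a synthetic sigmoidal curve sampled every 0.5 °C between 25 °C and 80 °C, as in the protocol above; the curve parameters are hypothetical.

```python
import math


def melting_temperature(temps, absorbances):
    """Estimate Tm as the temperature of the steepest absorbance increase.

    Uses central differences, so the two end points are never returned.
    """
    if len(temps) != len(absorbances) or len(temps) < 3:
        raise ValueError("need two equally long series with at least 3 points")
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (absorbances[i + 1] - absorbances[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic heating curve: 25-80 °C in 0.5 °C steps, transition centred at 54 °C.
temps = [25.0 + 0.5 * i for i in range(111)]
absorbances = [0.20 + 0.05 / (1.0 + math.exp(-(t - 54.0) / 2.0)) for t in temps]

print(melting_temperature(temps, absorbances))
```

For real data, the heating and cooling curves would be processed separately and the replicate values averaged; noisy curves usually need smoothing before differentiation.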

#### *2.6. Alkylation with ODN Probe to HhaI DNMT-1*

The duplex oligo (final 1 µM) in pH 7.5 reaction buffer (50 mM Tris-HCl, 10 mM EDTA, 5 mM 2-mercaptoethanol) was pre-incubated at 37 °C for 5 min. HhaI methyltransferase was then added and the mixture was pre-incubated at 37 °C (or 4 °C) for another 5 min. SAM was added and the mixture was incubated at 37 °C (or 4 °C) for 0.5–24 h. At each time point, an aliquot of the reaction mixture was taken (4.5 µL for silver staining, 12 µL for Flamingo gel staining). Loading buffer (6× Tris-HCl, SDS, glycerol, 0.05% bromophenol blue, 200 mM DTT (final 20 mM)) was added and the mixture was heated at 65 °C for 15 min. The samples were separated by 10% SDS-PAGE (0.025 M Tris, 0.192 M glycine, 0.1% SDS) at room temperature (10 mA/1.5 h, then 20 mA/1.0 h). The 5′-FAM oligonucleotide was visualized by fluorescence imaging using an FLA-5100 Fluor Imager. The protein was visualized by silver staining or Flamingo gel staining.

#### **3. Results**

#### *3.1. Synthesis of the acyAOVT Nucleoside Phosphoramidite*

We planned to synthesize the C6-substituted 5-azacytosine by ring opening of the 5-azacytosine moiety to form a guanylurea, followed by a ring-closing reaction, according to a previous report (Scheme 1) [20]. In our previous study, acyclic linkers protected with MOM or Bn groups gave the coupling products in low yields. The TBS-protected acyclic side chain **3** was synthesized from 2′-deoxy-d-ribose by a modification of a previously reported procedure [21]. The coupling reaction between the sodium salt of 5-azacytosine and the acyclic side chain compound **3** gave the N1- and N3-alkylated products. The reaction conditions were screened to optimize the formation of the N1-alkylated product **4**, which was obtained in 58% yield in DMSO at 60 °C with an N1:N3 ratio of 6:1. The formation of the glycosidic bond at the N1 position of 5-azacytosine was confirmed by HMQC and HMBC analyses.

**Scheme 1.** *Reagents and conditions*: (**a**) 5-azacytosine sodium salt, DMSO, 60 °C, 18 h, 58%; (**b**) 7 M NH4OH, MeOH-1,4-dioxane (1:1), 55 °C, 24 h; (**c**) orthoester, DMF, 120 °C, 2 h, 51% (2 steps); (**d**) AcCl, pyridine, rt, 19 h, 54%; (**e**) BF3·OEt2, CH2Cl2, 0 °C, 2 h, 52%; (**f**) DMTrCl, pyridine, 0 °C, 1 h, 70%; (**g**) 2-cyanoethyl-*N*,*N*-diisopropylchlorophosphoramidite, DIPEA, CH2Cl2, 0 °C, 1 h, 72%.

The 5-azacytidine derivative **4** was treated with NH3 to give the guanylurea intermediate **5**, which was condensed with the orthoester to afford the desired C6-octylthioethyl-5-azacytidine derivative **6** in 45% yield (2 steps). After the 4-*N*-acetylation and the deprotection of the TBS groups by BF3·OEt2, the 5′-hydroxyl group was selectively protected with the DMTr group, then the 3′-hydroxyl group was phosphitylated to yield the *acy*AOVT nucleoside phosphoramidite **10**.

#### *3.2. Synthesis of the Oligonucleotides Containing acyAOVT and Evaluation of the Crosslinking Reactivity*

We synthesized two ODNs, **ODN1** and **ODN2**, on a DNA synthesizer using phosphoramidite **10** (Scheme 2). The synthesized ODNs were treated with 45 mM K2CO3/MeOH containing 10 mM 1-octanethiol to cleave them from the resin. After the DMTr-ON purification by reverse-phase (RP) HPLC, detritylation was carried out in an aqueous 10% AcOH solution at room temperature for 30 min. The obtained ODNs were further purified by RP-HPLC and characterized by MALDI-TOF MS analysis. The sulfide groups of **ODN1** and **ODN2** were oxidized with magnesium monoperoxyphthalate (MMPP) and treated with aqueous 10% AcOH to give **ODN5** and **ODN6**, according to the reported procedure [20]. The ODNs were characterized by MALDI-TOF MS.

**Scheme 2.** *Reagents and conditions*: (**a**) DNA synthesizer; (**b**) (i) 45 mM K2CO3-MeOH, 10 mM 1-octanethiol, rt, 4 h; (ii) 10% AcOH, rt, 30 min; (**c**) 1 mM MMPP (2 eq) in carbonate buffer (pH 10), rt, 1 h; (**d**) 5% aq. AcOH, rt, 2 h.

The crosslinking reactions of the reactive **ODN5** with the complementary target **DNA1** or **RNA1**, labelled with fluorescein at the 5′-end, were first investigated under neutral conditions. The reaction mixture was analyzed by 20% polyacrylamide gel electrophoresis (PAGE) containing 7 M urea, and the yields of the cross-linked product were calculated from the fluorescence intensity of each band observed on the gels. Figure 3 shows a comparison of the reactivity of **ODN5** toward the different bases at the target site of **DNA1** or **RNA1**. In DNA, **ODN5** showed relatively high crosslinking reactivity to dC, dT, and dG, but not to dA. In RNA, on the other hand, the crosslinked product was observed in high yields with rC and rU, whereas no significant products were observed with rG and rA. The reactivities toward DNA and RNA of *acy*AOVT **2**, 4-amino-6-oxo-2-vinylpyrimidine bearing an acyclic linker (*acy*AOVP **11**), and Et-AOVT **1** are compared in Figure 4. *acy*AOVT **2** and *acy*AOVP **11** showed a significant difference in the reaction rate and the base selectivity toward the target DNA and RNA (Figure 4A,B).

**Figure 3.** Denaturing gel electrophoresis of the crosslinking reaction to target **DNA1** and **RNA1** with **ODN5**. The reaction was performed in 50 mM MES buffer (pH 7.0) containing 100 mM NaCl, **ODN5** (5 µM) and cDNAs or cRNAs (2 µM) at 37 °C.

The reaction rate with *acy*AOVT **2** was faster than that with *acy*AOVP **11**. The higher reaction rate might be attributed to the electron-withdrawing effect of the triazine ring, which makes the 6-vinyl group of *acy*AOVT **2** more reactive than that of the pyrimidine-type *acy*AOVP **11**. Compared with the target base selectivity of the previously reported *acy*AOVP **11** toward **DNA1** [21], AOVT **1** and **2** showed a drastically increased reactivity with cytosine. On the other hand, the crosslinking yields to **RNA1** with AOVT **1** and **2** were significantly higher for cytosine and uracil, and relatively lower for guanine, than those of AOVP **11**. The reaction rate and selectivity toward the pyrimidine bases (C, T and U) with *acy*AOVT **2** were comparable to those of Et-AOVT **1**. The reactivity to adenine (A) in DNA and RNA with **2** was significantly lower than that of Et-AOVT **1** (Figure 4A,C).

Next, we carried out a thermal denaturation study of the duplex containing the non-reactive *acy*AOVT. **ODN7**, bearing the non-reactive *acy*AOVT (SMe), was prepared by the addition of NaSMe to the vinyl derivative under weakly acidic conditions to avoid the decomposition of *acy*AOVT to guanylurea (Scheme 3).

**Figure 4.** Comparison of the reaction yields calculated from the gel electrophoresis analysis of the cross-linking to target DNA**1** (**Y** = dT, dG, dC, dA) or RNA**1** (**N** = U, G, C, A) with *acy*AOVT (**2**) (**A**), *acy*AOVP (**11**) (**B**) and Et-AOVT (**1**) (**C**).

**Scheme 3.** Synthesis of the ODN bearing a stable precursor.

The melting temperatures (*T*<sub>m</sub>) of the **ODN7**/**DNA1** and **ODN7**/**RNA1** duplexes are summarized in Figure 5. The *T*<sub>m</sub> values of these duplexes were lower than that of the unmodified natural duplex (*T*<sub>m</sub> = 67 °C). It should be noted that the *T*<sub>m</sub> values for the duplexes **ODN7**/**DNA1** (**Y** = dG) and **ODN7**/**RNA1** (**Y** = rG) were 54 °C and 57 °C, respectively, higher than those of the other duplexes. Taken together, the results of the *T*<sub>m</sub> measurements suggest that AOVT might form three stable hydrogen bonds with guanine, orienting the C6-vinyl group of AOVT toward the side opposite the complementary base. Next, we attempted the crosslinking reaction to a nucleic acid-binding protein using the duplex DNA bearing AOVT derivative **2**.

**Figure 5.** Comparative thermal denaturing analysis of duplexes **ODN7**/**DNA1** or **RNA1**. *T*<sub>m</sub> values were measured in 50 mM MES buffer (pH 7.0) containing 100 mM NaCl.

#### *3.3. Crosslinking Reactions to DNA Methyl Transferase*

We chose cytosine-5-DNA methyl transferase (DNMT) as a model DNA binding protein. DNMT performs the methylation of cytosines for epigenetic modification of the genome that is involved in regulating many cellular processes [6]. In addition, aberrant methylation can cause many diseases, such as cancer [22–24]. The efficient inhibition of DNMT offers the possibility of interfering with the methylation process, which may allow control of the epigenetic regulation of cells and treat various cancers [25].

The mechanism of the cytosine-5 methylation by DNMT is illustrated in Figure 6A. The thiol of a Cys residue in the active site of DNMT acts as a nucleophile and attacks the C6 position of the cytosine to form an enzyme-linked intermediate. The resulting nucleophilic C5 position of the cytosine is then methylated by S-adenosyl-l-methionine (SAM). Subsequent abstraction of the proton at the 5 position and β-elimination provide the 5-substituted pyrimidine and the active enzyme (Figure 6A) [26]. Based on this mechanism, modified bases that form a covalent bond with DNMT have been reported as irreversible inhibitors [27–31]. We expected that *acy*AOVT (**2**), placed at the position of the cytosine methylated by DNMT, would react with the Cys residue in this enzyme (Figure 6B). In our investigation, we used a bacterial DNMT, i.e., the commercially available HhaI DNMT-1 from *Haemophilus haemolyticus*. HhaI DNMT-1 methylates the internal 2′-deoxycytidine in the target sequence 5′-G**C**GC-3′, in which AOVT was substituted in place of the internal 2′-dC. We synthesized **ODN6**, which contained AOVT in the HhaI DNMT-1 recognition sequence, and evaluated its crosslinking reactivity to DNA. The base selectivity was similar, but the reactivity to the target DNA was significantly lower than that of the previous sequence (Figure S12).

**Figure 6.** (**A**) Plausible mechanism for methylation by cytosine-5 methyltransferases (DNMTs) with SAM as a cofactor. (**B**) Possible mechanism for covalent trapping with the AOVT (**2**) derivative.

We selected **DNA2**, containing methylcytosine, as the target DNA because Dnmt1 methylates hemimethylated CpG sites on one strand of double-stranded DNA. The methylation conditions were determined using the duplex formed between **DNA2** and **DNA3**, which contained cytosine instead of acyclic AOVT, and the mono-methylated product of **DNA3** was confirmed by MALDI-TOF MS. The duplex DNA (**ODN6** and **DNA2**) was incubated with HhaI DNMT-1 in the reaction buffer containing 5 mM 2-mercaptoethanol at 37 °C under the determined conditions. The reaction mixture was analyzed by 15% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and visualized with silver staining. A slower-mobility band (\*) was observed at around 51 kDa and was not observed with the non-reactive duplex DNA and HhaI DNMT-1 (Figure 7A). We then performed the reaction using the 5′-FAM-labelled **ODN6** and analyzed it by SDS-PAGE. The slower-mobility band detected by silver staining matched the fluorescent band in the reaction using the 5′-FAM-labelled **ODN6** (Figure 7B), indicating that this band was derived from the complex between HhaI DNMT-1 and the duplex DNA. The products were next quantified by in-gel fluorescence analysis using Flamingo™ gel staining. We confirmed a linear correlation between the protein amount and the fluorescence intensity. Thus, the complex between the protein and the ODN can be quantified by measuring the fluorescence intensity on the gel (Figure 7C). The reactions were performed at 37 °C and 4 °C. The yields were obtained by quantification of the fluorescence intensity of each band and reached 77% at 37 °C after 24 h.

**Figure 7.** Evaluation of the crosslinking reaction between duplex DNA and HhaI DNMT-1. (**A**) Silver-stained SDS-PAGE gel for analysis of the reactions. Duplex DNA (**ODN6** (**X** = C or *acy*AOVT) and **DNA2**: 3 µM) and HhaI DNMT-1 (1 µM) were incubated in the presence of SAM at 4 °C. (**B**) 5′-FAM-labelled **ODN6** was used in the same reaction as in (**A**) and the gel was visualized by silver staining and fluorescence. (**C**) Flamingo™-stained SDS-PAGE gel for analysis of the reactions. Duplex DNA (**ODN6** (**X** = C for control or *acy*AOVT) and **DNA2**: 1 µM) and HhaI DNMT-1 (1 µM) were incubated in the presence of SAM at 4 °C or 37 °C. (**D**) Time course of the yields at 4 °C or 37 °C.

The yields decreased significantly at 4 °C (Figure 7D), suggesting that the reaction rate depends on the reaction temperature. The reactions were also performed using **ODN6** and **DNA4** containing each of the four bases (G, A, C and T) at the site complementary to *acy*AOVT. The crosslinking yields to these targets were similar to that observed with G (Figure S13). These results suggest that DNMT can bind to the mismatched sequences and flip out the target *acy*AOVT, inducing the crosslinking reaction. Taken together, the results for the reaction with HhaI DNMT-1 indicate that the *acy*AOVT in the duplex reacts with HhaI DNMT-1 even in the presence of 2-mercaptoethanol.

#### **4. Discussion**

We have synthesized ODNs containing the *acy*AOVT base **2**, which is expected to react with a nucleophilic residue of a nucleic acid-binding protein in the proximity of the vinyl group. The reactivity of *acy*AOVT toward DNA or RNA was significantly higher than that of *acy*AOVP. The interesting result that a single nitrogen substitution in AOVP drastically increased the reactivity could be attributed to the electron-withdrawing effect of the nitrogen atom. *acy*AOVT (**2**) showed relatively high yields with dC, dT and dG in DNA and with rC and rU in RNA. The highest reactivity to cytosine with the AOVT derivatives **1** and **2** might be due to the high nucleophilicity of the amino group of cytosine and to the flexible linker, which allows the vinyl group to approach the amino group of cytosine. The cross-linking yields to adenine in DNA or RNA with *acy*AOVT **2** were significantly lower than those of Et-AOVT **1**. The vinyl group in *acy*AOVT **2** might be located away from the reactive site of adenine because the short linker of **2** holds the base in a position similar to that of a natural base pair, resulting in the low crosslinking yields to adenine (Figure S14). The melting temperatures (*T*<sub>m</sub>) of the **ODN7**/**DNA1(G)** and **ODN7**/**RNA1(G)** duplexes were the highest among all the duplexes containing AOVT-base pairs. These results suggest that the duplex containing *acy*AOVT-G is stabilized by the formation of three hydrogen bonds, which orients the vinyl group of *acy*AOVT toward the side opposite the complementary base (Figure 8).

**Figure 8.** Hypothetical structure for *acy*AOVT and guanine in duplex DNA.

We attempted the reaction to the commercially-available DNMT using the duplex containing *acy*AOVT-G as a model reaction with the nucleic acids-binding protein. The results indicated that *acy*AOVT might react with the thiol group of the cysteine residue in DNMT in close proximity as shown in Figure 9.

**Figure 9.** Predicted structure of the reaction intermediate between *acy*AOVT and cysteine in DNMT (PDB: 5MHT).

We expect that this *acy*AOVT-G base pairing system would provide a "cross-linkable duplex material" to react with the nucleic acid-binding proteins.

#### **5. Conclusions**

We have designed *acy*AOVT **2** as a base reactive toward nucleic acid-binding proteins. The ODN containing **2** exhibited a higher reactivity than *acy*AOVP to the pyrimidine bases (C, T and U) at the site complementary to **2** in DNA-DNA and DNA-RNA duplexes. The results of the *T*<sub>m</sub> measurements suggested that *acy*AOVT **2** can form a stable base pair with G and, consequently, that the vinyl group of **2** might be located on the side opposite the complementary base. The preliminary results for the reaction of DNMT with the duplex DNA containing *acy*AOVT demonstrated the potential of the *acy*AOVT-G base pairing system for reactions with nucleic acid-binding proteins. This system can be used for crosslinking reactions to RNA-binding proteins. We expect that our system will provide a useful tool for the molecular study of RNA-binding proteins that control the biological functions of RNA.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/21/7709/s1.

**Author Contributions:** K.O. and K.Y. conceived and designed the experiments. K.O. and S.I. performed the experiments. F.N. and H.O. wrote the study. F.N. supervised the project. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was supported by a Grant-in-Aid for Scientific Research on Innovative Areas "Middle Molecular Strategy" (No. JP15H05838) from the Japan Society for the Promotion of Science (JSPS). This work was supported in part by "Dynamic Alliance for Open Innovation Bridging Human, Environment and Materials" from the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT).

**Conflicts of Interest:** The authors declare no conflict of interest.


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Encapsulation of mRNA into Artificial Viral Capsids via Hybridization of a** β**-Annulus-dT20 Conjugate and the Poly(A) Tail of mRNA**

**Yoko Nakamura <sup>1</sup> , Yuki Sato <sup>1</sup> , Hiroshi Inaba 1,2, Takashi Iwasaki <sup>3</sup> and Kazunori Matsuura 1,2,\***


Received: 18 September 2020; Accepted: 9 November 2020; Published: 12 November 2020

**Abstract:** Messenger RNA (mRNA) drugs have attracted considerable attention as promising tools with many therapeutic applications. The efficient delivery of mRNA drugs using non-viral materials is currently being explored. We demonstrate a novel concept where mCherry mRNA bearing a poly(A) tail is encapsulated into capsids co-assembled from viral β-annulus peptides bearing a 20-mer oligothymine (dT20) at the N-terminus and unmodified peptides via hybridization of dT20 and poly(A). Dynamic light scattering measurements and transmission electron microscopy images of the mRNA-encapsulated capsids show the formation of spherical assemblies of approximately 50 nm. The encapsulated mRNA shows remarkable ribonuclease resistance. Further, modification by a cell-penetrating peptide (His16) on the capsid enables the intracellular expression of mCherry of encapsulated mRNA.

**Keywords:** mRNA; poly(A) tail; artificial viral capsid; encapsulation; nanocapsule; self-assembly; β-annulus peptide; peptide-DNA conjugate

#### **1. Introduction**

Therapeutic mRNAs have attracted attention in recent years as a new type of nucleic acid drug. Such mRNAs show potential for use in protein replacement therapy and vaccination [1–6]. mRNA drugs have the advantage of inducing direct protein synthesis in the cytoplasm, thus differing from DNA therapy. Naked mRNA is, however, unstable outside of cells and is unable to effectively penetrate cell membranes due to electrostatic repulsion. Thus, various materials for mRNA delivery, such as liposomes, dendrimers, and polyion complex micelles, have been developed and are attractive for their flexibility in molecular design [2–11]. Conversely, RNA viruses package specific genome molecules inside their outer protein shell, the viral capsid [12,13]. Viral capsids are also attractive for precise mRNA packaging and delivery, but the use of natural viruses for mRNA delivery raises safety concerns. Virus-like artificial protein cages for RNA packaging are, therefore, attractive. Recently, Hilvert and coworkers demonstrated that a non-viral protein cage formed by *Aquifex aeolicus* lumazine synthase selectively packages mRNA via cationic peptide tags [14,15].

Eukaryotic mRNA possesses a long adenine nucleotide (poly(A)) tail of about 100–200 nt at the 3′-end to regulate translation [16–21]. We propose a novel strategy for the encapsulation of mRNA into virus-like peptide nanocapsules via hybridization between dT20 and the poly(A) tails. We developed an "artificial viral capsid" with a size of 30–50 nm self-assembled from the β-annulus peptide (INHVGGTGGAIMAPVAVTRQLVGS) that participates in the formation of the dodecahedral internal skeleton of the tomato bushy stunt virus [22–35]. Artificial viral capsids encapsulate anionic guest molecules and His-tagged green fluorescence protein (GFP) into the cationic and Ni-NTA (Ni-nitrilotriacetic acid complex) modified interior [25–28]. Modification of the C-terminal, expected to be directed to the exterior of capsids, enabled the surface modification of artificial viral capsids with gold nanoparticles, coiled-coil peptides, single-stranded DNAs, and proteins [29–34]. Recently, we demonstrated that β-annulus peptide possessing ssDNA at the N-terminal self-assembled into ssDNA-encapsulated artificial viral capsids [35]. Thus, we designed a β-annulus peptide with dT20 at the N-terminal to direct mRNA to the interior of capsids via hybridization with poly(A) tails (Figure 1).

**Figure 1.** Schematic of synthesis of dT20-modified β-annulus peptide (dT20-SS-β-annulus) and formation of an artificial viral capsid by co-assembly of β-annulus peptide and dT20-SS-β-annulus hybridized with mRNA.

#### **2. Materials and Methods**

## *2.1. General*

Reverse-phase HPLC was performed at ambient temperature with a Shimadzu LC-6AD liquid chromatograph equipped with a UV–Vis detector (220 nm and 260 nm, Shimadzu SPD-10AVvp, Kyoto, Japan) using an Inertsil WP300 C18 column (GL Science, 250 mm × 4.6 mm or 250 mm × 20 mm). MALDI-TOF mass spectra were obtained on an Autoflex T2 (Bruker Daltonics, Billerica, USA) in linear/positive mode with α-cyano-4-hydroxycinnamic acid (α-CHCA) or 3-hydroxypicolinic acid (3-HPA) with diammonium hydrogen citrate as the matrix. The UV–Vis spectra of DNA-conjugated peptides were measured at 260 nm using a Jasco V-630 (JASCO Corporation, Tokyo, Japan) with a quartz cell (S10-UV-1, GL Science). An mRNA coding mCherry fluorescent protein (mCherry mRNA, ~1 kb) with a 3′-polyadenylic acid (poly(A)) tail was purchased from OZ Biosciences (Marseille, France). RNase A was purchased from Nacalai Tesque (Kyoto, Japan). All other reagents were obtained from a commercial source and used without further purification. Deionized water of high resistivity (>18 MΩ cm) was purified using a Millipore Purification System (Milli-Q water, Merck Millipore, Burlington, USA) and used as a solvent.

#### *2.2. Preparation of dT20-SS-*β*-Annulus and* β*-Annulus Peptides*

β-annulus (INHVGGTGGAIMAPVAVTRQLVGS) and Cys-β-annulus (CINHVGGTGGAIMAPVAVTRQLVGS) peptides were synthesized with a Biotage Initiator<sup>+</sup> (Biotage, Uppsala, Sweden) using standard Fmoc-based coupling chemistry as previously described [22,28]. MALDI-TOF-MS of the β-annulus peptide (matrix: α-CHCA): *m*/*z* = 2306 (exact mass: 2306) and of the Cys-β-annulus peptide (matrix: α-CHCA): *m*/*z* = 2409 (exact mass: 2409).

An amine-modified dT20 (5′-NH2-(CH2)6-TTTTTTTTTTTTTTTTTTTT-3′, Gene Design Inc., Osaka, Japan) in 50 mM sodium phosphate buffer (pH 8.0) was added to a 20-fold molar excess of *N*-succinimidyl 3-(2-pyridyldithio)propionate (SPDP) in acetonitrile and the mixture was incubated for 1 h at 25 °C. The mixture was dialyzed against water for 24 h using a Spectra/Por 7 membrane with a molecular weight cutoff of 1000 (Spectrum Laboratories, Inc., Rancho Dominguez, USA). The internal solution was lyophilized to afford a flocculent solid. This product was dissolved in water and its concentration was determined by UV–Vis spectroscopy. The yield of the product (PySS-dT20) was 43.6 nmol (94.8%). MALDI-TOF-MS (matrix: 3-HPA): *m*/*z* = 6409 (exact mass: 6398).

An aqueous solution of PySS-dT20 (0.1 mM) in 50 mM sodium phosphate buffer (pH 7.2) was mixed with the Cys-β-annulus peptide (2 mM) in 50 mM sodium phosphate buffer (pH 7.2) and the mixture was incubated for 24 h at 25 °C. The solution was purified by reverse-phase HPLC, eluting with a linear gradient of CH3CN/0.1 M aqueous ammonium formate (10/90 to 100/0 over 95 min). The eluted fraction was collected, concentrated in a centrifugal evaporator, and dialyzed as above against water for 20 h. The internal solution was lyophilized to afford a flocculent solid. This product was dissolved in water and its concentration was determined by UV–Vis spectroscopy. The yield of the product (dT20-SS-β-annulus peptide) was 12.6 nmol (27.4%). MALDI-TOF-MS (matrix: 3-HPA): *m*/*z* = 8710 (exact mass: 8698).

#### *2.3. Preparation and Characterization of Artificial Viral Capsids*

Lyophilized dT20-SS-β-annulus and β-annulus peptides were dissolved separately in water. The peptide solutions were mixed at molar ratios of dT20-SS-β-annulus:β-annulus = 1:9, 1:4.5, 1:2, and 1:1, sonicated for 5 min, and lyophilized. Stock solutions of peptides were prepared by dissolving the peptide powders in 1× phosphate buffered saline (PBS, pH 7.4) and sonicating for 3 min.

Dynamic light scattering (DLS) was measured with a Zetasizer NanoZS (Malvern, Kobe, Japan) instrument at 25 °C using an incident He-Ne laser (633 nm). Correlation times of scattered light intensities *G*(τ) were measured several times and the means were fitted to Equation (1), where *B* is the baseline, *A* is the amplitude, *q* is the scattering vector, τ is the delay time, and *D* is the diffusion coefficient.

$$G(\tau) = B + A \exp(-2q^2 D\tau) \tag{1}$$

Hydrodynamic radii (*R*<sub>H</sub>) of scattering particles were calculated using the Stokes-Einstein Equation (2), where η is the solvent viscosity, *k*<sub>B</sub> is Boltzmann's constant, and *T* denotes absolute temperature.

$$R_H = \frac{k_B T}{6\pi\eta D} \tag{2}$$
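As a worked illustration of Equations (1) and (2), the sketch below converts a fitted decay rate into a diffusion coefficient and then into a hydrodynamic radius. Several inputs are assumptions, not values from this study: the scattering vector is computed with the standard expression *q* = (4π*n*/λ) sin(θ/2), the detection angle is taken as 173° (backscatter), the refractive index and viscosity are those of pure water at 25 °C rather than PBS, and the decay rate is a hypothetical number chosen to give a radius of about 25 nm.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K (exact SI value)


def scattering_vector(refractive_index, wavelength_m, angle_deg):
    """q = (4*pi*n / lambda) * sin(theta / 2), in 1/m."""
    theta = math.radians(angle_deg)
    return 4.0 * math.pi * refractive_index / wavelength_m * math.sin(theta / 2.0)


def diffusion_from_decay_rate(decay_rate_per_s, q_per_m):
    """Equation (1): the exponent is -2 q^2 D tau, so the decay rate is 2 q^2 D."""
    return decay_rate_per_s / (2.0 * q_per_m ** 2)


def hydrodynamic_radius(diffusion_m2_per_s, temperature_k, viscosity_pa_s):
    """Equation (2): R_H = k_B T / (6 pi eta D), in metres."""
    return BOLTZMANN * temperature_k / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_per_s)


# Assumed inputs: water at 25 °C, 633 nm He-Ne laser, 173° detection angle.
q = scattering_vector(refractive_index=1.33, wavelength_m=633e-9, angle_deg=173.0)
decay_rate = 1.36e4  # 1/s, hypothetical fitted value
diffusion = diffusion_from_decay_rate(decay_rate, q)
radius_nm = hydrodynamic_radius(diffusion, temperature_k=298.15, viscosity_pa_s=0.89e-3) * 1e9

print(f"D = {diffusion:.3e} m^2/s, R_H = {radius_nm:.1f} nm")
```

A radius of about 25 nm corresponds to a diameter of about 50 nm, the size range reported for the capsids in the abstract.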

Transmission electron microscopy (TEM) images were obtained with a JEOL JEM 1400 Plus (JEOL Ltd., Tokyo, Japan), using an accelerating voltage of 80 kV. Aliquots (5 µL) of DLS samples were applied to hydrophilized carbon-coated Cu-grids (C-SMART Hydrophilic TEM grids, Alliance Biosystems, Osaka, Japan) for 60 s and then removed. Subsequently, an aqueous solution (5 µL) of 2% phosphotungstic acid, Na3(PW12O40)(H2O)n, was applied to the TEM grid. The staining solution was removed after 60 s and the sample-loaded grids were dried *in vacuo*.

## *2.4. Preparation of Complex of dT20-SS-*β*-Annulus:*β*-Annulus Peptide Hybridized with mCherry mRNA*

An aqueous solution of mCherry mRNA in 1× PBS (1 mg/mL, [nucleotide] ≈ 3 mM, pH 7.4) was mixed with dT20-SS-β-annulus:β-annulus peptide powders by gentle pipetting without sonication. The peptide solutions were diluted in 1× PBS (pH 7.4) so that the nucleotide concentration of mCherry mRNA was equal to that of the thymine in dT20 ([mRNA nt] = [T] = 1 mM, [dT20] = 50 µM, [RNA] = 1 µM). In a typical experiment, the dT20-SS-β-annulus:β-annulus/mRNA solution was incubated for 30 min at 25 °C.
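The concentrations quoted above are related by simple arithmetic, sketched below. The average nucleotide molar mass of 330 g/mol is our assumption (a typical value for RNA), and the mRNA length of 1000 nt is taken from the "~1 kb" given in Section 2.1; the function names are illustrative only.

```python
def nucleotide_conc_mM(mass_conc_mg_per_mL: float, avg_nt_mw: float = 330.0) -> float:
    """Nucleotide concentration (mM) from a mass concentration (mg/mL = g/L).

    avg_nt_mw is an assumed average molar mass per nucleotide in g/mol.
    """
    return mass_conc_mg_per_mL / avg_nt_mw * 1000.0


def strand_conc_uM(nucleotide_mM: float, length_nt: int) -> float:
    """Strand concentration (µM) from nucleotide concentration (mM) and length."""
    return nucleotide_mM * 1000.0 / length_nt


def base_conc_mM(strand_uM: float, length_nt: int) -> float:
    """Base concentration (mM) from strand concentration (µM) and length."""
    return strand_uM * length_nt / 1000.0


print(f"1 mg/mL mRNA stock: {nucleotide_conc_mM(1.0):.2f} mM nucleotide")
print(f"1 mM nucleotide, ~1 kb mRNA: {strand_conc_uM(1.0, 1000):.1f} µM strands")
print(f"50 µM dT20: {base_conc_mM(50.0, 20):.1f} mM thymine")
```

Under these assumptions the 1 mg/mL stock is about 3 mM in nucleotides, and the working solution pairs 1 mM mRNA nucleotides (1 µM strands) with 1 mM thymine (50 µM dT20), as stated in the text.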

#### *2.5. Electrophoretic Mobility Shift Assay and Nuclease Resistance Assay*

Electrophoretic mobility shift assay (EMSA) was employed for the detection of peptide complexes with nucleic acids. The dT20-SS-β-annulus:β-annulus/mRNA solutions were loaded onto 3% *w*/*v* agarose gels in a TAE buffer, pre-cast with GelRed (Wako Pure Chemical Industries, Osaka, Japan) for nucleic acid detection. Three microliters of the dT20-SS-β-annulus:β-annulus/mRNA solution ((mRNA nt) = (T) = 1 mM) was loaded with 5 µL of BlueJuice gel loading buffer (Thermo Fisher, Waltham, USA). Electrophoresis was performed at 210 V for 30 min using an Atto AE-6100 with an Atto mypower II300 AE-8130 (Atto, Tokyo, Japan). The bands were visualized with a UV illuminator (TP-15 MP, Atto, Tokyo, Japan) and the images were recorded with a digital camera.

Ribonuclease A from bovine pancreas (Nacalai Tesque, Kyoto, Japan) is an endoribonuclease which specifically hydrolyzes the 3′ end of pyrimidine residues in single-stranded RNA. Each peptide/mRNA solution was incubated with a solution of RNase A (3 µL, 1 U/µL) in PBS for 10–60 min at 37 ◦C. Enzymatic degradation of mRNA was evaluated by EMSA.

#### *2.6. Confocal Laser Scanning Microscopy (CLSM) Measurements of In-Cell Expression of mCherry mRNA*

Confocal laser scanning microscopy (CLSM) was performed with a FluoView FV10i (Olympus, Tokyo, Japan). Human hepatoma HepG2 cells were cultured in DMEM (10 v/v% FBS, 100 µg/mL streptomycin, 100 units/mL penicillin, 1 mM sodium pyruvate, and 1 v/v% MEM nonessential amino acids) for 24 h at 37 °C in a 5% CO2 atmosphere. The cells were seeded onto single-well bottom dishes at 2.0 × 10<sup>4</sup> cells/well in a final volume of 100 µL and incubated for 24 h at 37 °C and 5% CO2. The medium was removed, and 50 µM dT20-SS-β-annulus, 450 µM β-annulus, and 1 µM mCherry mRNA in fresh medium were added to the cells, which were then incubated for 48 h under 5% CO2. The co-assembly mixture of 50 µM dT20-SS-β-annulus, 450 µM β-annulus-His16, and 1 µM mCherry mRNA was added to the cells to evaluate the expression of mCherry fluorescent protein. A complex of 2 µM TransIT-mRNA (Transfection kit, Mirus Bio LLC, Madison, USA) with 1 µM mCherry mRNA was added to cells as a positive control. CLSM images of mCherry fluorescence were obtained with excitation at 587 nm and an mCherry band-pass filter (Red). Fluorescence intensity was measured per cell from CLSM images, after subtraction of the background intensity, using ImageJ 1.51 software.
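The background-subtracted quantification described above can be sketched as follows. This is a minimal illustration in Python, not the ImageJ procedure itself; the function names and all pixel values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def corrected_intensity(cell_pixels, background_pixels):
    """Mean intensity of a cell region minus the mean background intensity."""
    return mean(cell_pixels) - mean(background_pixels)


def relative_expression(sample, reference):
    """Fold change of a corrected sample intensity over a reference intensity."""
    return sample / reference


# Hypothetical pixel values (arbitrary units), for illustration only.
background = [10.0, 12.0, 11.0, 11.0]
reference_cells = [15.0, 16.0, 14.0, 15.0]
sample_cells = [50.0, 52.0, 51.0, 51.0]

reference = corrected_intensity(reference_cells, background)
sample = corrected_intensity(sample_cells, background)
fold = relative_expression(sample, reference)
print(f"reference = {reference}, sample = {sample}, fold change = {fold}")
```

Subtracting the background before taking the ratio matters: without it, the fold change between weakly and strongly fluorescent cells would be underestimated.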

#### **3. Results and Discussion**

#### *3.1. Synthesis and Self-Assembling Behavior of* β*-Annulus Peptide Bearing dT<sup>20</sup> at the N-Terminus*

We designed a β-annulus peptide bearing dT20 at the N-terminus that was directed to the interior of the capsid (Figure 1) to encapsulate mRNA via hybridization between dT20 and the poly(A) tail at the 3′ end of mRNA. A 5′-terminal aminated 20-mer thymidine nucleotide (dT20-NH2) was reacted with *N*-succinimidyl 3-(2-pyridyldithio)propionate (SPDP) to obtain a dT20 bearing a pyridyl disulfide group at the 5′-end (PySS-dT20). A Cys-β-annulus peptide (CINHVGGTGGAIMAPVAVTRQLVGS) containing Cys at the N-terminus was synthesized by standard Fmoc-based solid-phase methods according to our reported procedure [28]. PySS-dT20 was attached to the Cys-β-annulus peptide by a disulfide-exchange reaction. A reversed-phase HPLC chart of the reaction mixture showed one peak at 26.7 min, which was different from the retention times of dT20-NH2 and PySS-dT20 (Figure 2A). The purified product was assigned by MALDI-TOF-MS as dT20-modified β-annulus peptide (dT20-SS-β-annulus) (Figure 2B). DLS of a 50 µM solution of dT20-SS-β-annulus in PBS (pH 7.4) exhibited formation of 77 ± 21 and 40 ± 8 nm assemblies (Figure 2C). TEM images of the aqueous solution stained with phosphotungstic acid showed the formation of spherical assemblies of approximately 40 nm in diameter (Figure 2D). The concentration dependence of the size distribution in DLS measurement indicated that dT20-SS-β-annulus formed assemblies with sizes ranging from 30 to 100 nm in a concentration range of 5–125 µM (Figure S1). This tendency was similar to the findings in our previous report [35]. Notably, β-annulus peptide modified with highly charged dT20 formed relatively stable artificial viral capsids.

**Figure 2.** (**A**) Reverse-phase HPLC chart of (**a**) dT20-NH2, (**b**) PySS-dT20 detected at 260 nm, eluted with a linear gradient of CH3CN/0.1 M NH4HCO2 aq (0/100 to 100/0 over 95 min), and (**c**) dT20-SS-β-annulus detected at 260 nm, eluted with a linear gradient of CH3CN/0.1 M NH4HCO2 aq (10/90 to 100/0 over 95 min). (**B**) MALDI-TOF-MS of the purified dT20-SS-β-annulus (matrix: 3-HPA). (**C**,**D**) TEM image and size distribution obtained from dynamic light scattering (DLS) for an aqueous solution of 50 µM dT20-SS-β-annulus in PBS (pH 7.4) at 25 °C.

#### *3.2. Complexation of mCherry mRNA and Artificial Viral Capsid Bearing dT20*

We initially attempted encapsulation of mRNA within the artificial viral capsid consisting of dT20-SS-β-annulus via hybridization between dT20 and mRNA. However, when poly(A) and dT20-SS-β-annulus were mixed at equimolar base concentrations, capsid formation was insufficient (data not shown). Self-assembly among dT20-SS-β-annulus peptides hybridized on a large poly(A) molecule is probably hindered by the excluded volume effect. Therefore, we developed an alternate strategy to encapsulate mRNA with dT20-SS-β-annulus and unmodified β-annulus peptide (INHVGGTGGAIMAPVAVTRQLVGS) (Figure 1). DLS and TEM images of the mixture of dT20-SS-β-annulus/unmodified β-annulus peptide at a 1:9 molar ratio in PBS showed the formation of spherical co-assemblies of size 45 ± 20 nm (Figure 3). Peptide mixtures at other molar ratios (dT20-SS-β-annulus:β-annulus = 1:4.5, 1:2, 1:1) also formed spherical co-assemblies of similar size (Figure S2). Next, we encapsulated mRNA encoding the mCherry fluorescent protein (mCherry mRNA) by co-assembly at different molar ratios of dT20-modified and unmodified β-annulus peptide.


**Figure 3.** (**A**) Size-distributions obtained from DLS and (**B**) TEM images for co-assembly peptides (dT20-SS-β-annulus:β-annulus = 1:9) in water at 25 °C.

An aqueous solution of mCherry mRNA in PBS was added to the lyophilized powder of dT20-SS-β-annulus:β-annulus peptide, followed by dilution with PBS and incubation for 30 min at 25 °C, to prepare capsids containing mCherry mRNA. The electrophoretic mobility shift assay (EMSA) showed a band of mCherry mRNA significantly shifted by complexation with a co-assembly mixture of dT20-SS-β-annulus:β-annulus peptide at a 1:9 molar ratio (Figure 4A, lane 4). In contrast, the other co-assembly mixtures showed only a slight retardation of mRNA (Figure 4A, lanes 5–7), which reflects the almost neutral net charge and relatively small size of the β-annulus peptide. The migration position of a mixed solution of unmodified β-annulus peptide and mCherry mRNA was not shifted from that of mCherry mRNA alone (Figure 4B). DLS and TEM images of the 1:9 co-assembly complexes with mCherry mRNA show the formation of spherical assemblies of size 52 ± 10 nm (Figure 5A,B). We previously reported that the N-terminus of the β-annulus peptide is directed toward the interior of the capsid and the surface ζ-potential of the capsid is almost zero at neutral pH [25]. Thus, when negatively charged RNA is encapsulated at neutral pH, the artificial viral capsid should migrate minimally from the applied position because the capsid shields the charge. Therefore, the EMSA results indicate that the dT20-SS-β-annulus:β-annulus peptide 1:9 co-assembly encapsulated mCherry mRNA via hybridization with dT20 directed to the interior of capsids. EMSA shows that mCherry mRNA could be complexed with the 1:9 co-assembly at 4 °C, 25 °C, 37 °C, and 60 °C (Figure S3). However, complexation at 4 °C and 60 °C afforded large aggregates with a bimodal size distribution (Figure S4). Thus, mCherry mRNA-encapsulated artificial viral capsids with a unimodal size distribution can be obtained by incubation with the 1:9 co-assembly mixture at temperatures of 25 °C–37 °C.

**Figure 4.** Gel shift assay of (**A**) co-assembly peptides hybridized with mCherry mRNA, (**B**) mixture of β-annulus peptide and mCherry mRNA in phosphate buffered saline (PBS) (pH 7.4) incubated at 25 °C for 30 min. M: DNA marker. All DNA markers used in the gel shift assay are the same. The gels were stained with GelRed.


**Figure 5.** (**A**) Size-distributions obtained from DLS and (**B**) TEM images for co-assembly peptides (dT20-SS-β-annulus:β-annulus = 1:9) and mCherry mRNA in PBS (pH 7.4) obtained after incubation at 25 °C.

#### *3.3. Nuclease Resistance of mRNA-Encapsulated Artificial Viral Capsid*

Resistance of the mCherry mRNA-encapsulated artificial viral capsid to endoribonuclease (RNase A) was confirmed by EMSA. Naked mCherry mRNA was digested by RNase A within 10 min (Figure 6(Aa),B lane 3). Conversely, mRNA in artificial viral capsids (band near the loading well) was minimally digested after 60 min (Figure 6(Ab),B lanes 5–8) due to protection by the capsid. As a control, artificial viral capsids bearing dT20 on the exterior were constructed by self-assembly of β-annulus peptide modified with dT20 at the C-terminus, which directs dT20 to the exterior of the capsid. The mixture of mCherry mRNA and the capsid bearing dT20 on the exterior did not cause a significant mobility shift (Figure 6B lane 9), indicating that mRNA was not encapsulated. In this mixture, mRNA was rapidly digested by RNase A within 10 min (Figure 6(Ac),B lane 10). Thus, mRNA is protected from degradation by encapsulation in artificial viral capsids.

**Figure 6.** (**A**) Illustration of digestion of mRNA by RNase A: (**a**) mRNA alone, (**b**) mRNA-encapsulated artificial viral capsid, and (**c**) mixture of mRNA and capsid modified with dT20 at the exterior. (**B**) Gel electrophoresis assay of digestion of mRNA by RNase A at 37 °C for 10 min (mRNA alone and β-annulus-dT20) and 10–60 min (dT20-SS-β-annulus:β-annulus = 1:9 co-assembly). The gel was stained with GelRed.

#### *3.4. In-Cell Expression of mCherry mRNA Encapsulated in Artificial Viral Capsid*

Uptake of mCherry mRNA-encapsulated capsids into human hepatoma HepG2 cells was analyzed by CLSM to evaluate mCherry mRNA expression inside cells. mCherry mRNA and a strong transfection reagent (TransIT-mRNA) were incubated with the cells for 48 h at 37 °C as a positive control; significant red fluorescence of mCherry was observed (Figure 7(Aa)). Unfortunately, minimal expression of encapsulated mCherry mRNA was observed, similar to that of naked mCherry mRNA (Figure 7A(b,c)). To enhance the cell permeability of the capsids, we employed a histidine 16-mer (His16), a cell-penetrating peptide that is uncharged under physiological conditions [36,37]. We constructed artificial viral capsids modified with His16 on the exterior using the 1:9 co-assembly of dT20-SS-β-annulus peptide and β-annulus peptide modified with His16 at the C-terminus (β-annulus-His16). Significant red fluorescence of mCherry was observed in cells incubated with His16-modified capsids (Figure 7(Ad)). The expression level of mCherry was quantified from the fluorescence intensity in the cell areas of the CLSM image using ImageJ. The expression level using His16-modified capsids was 26.5 times that observed with unmodified capsids (Figure 7B). This indicates that the intracellular expression of mCherry was significantly enhanced by the modification of the surface of the artificial viral capsid with His16. We have demonstrated in our previous work that reductive cleavage of the disulfide bond of the artificial viral capsid by dithiothreitol (DTT) caused the controlled release of ssDNA with the destruction of the capsid [35]. mCherry mRNA was therefore likely released from the capsids in the reducing environment inside cells.

**Figure 7.** (**A**) Confocal laser scanning microscopy (CLSM) images of HepG2 cells transfected with mCherry mRNA encapsulated in artificial viral capsids: (**a**) positive control (2 µM TransIT-mRNA with 1 µM mCherry mRNA), (**b**) 1 µM naked mCherry mRNA, (**c**) mRNA-encapsulated artificial viral capsid (50 µM dT20-SS-β-annulus, 450 µM β-annulus, and 1 µM mCherry mRNA), and (**d**) mRNA-encapsulated histidine 16-mer (His16)-artificial viral capsid (50 µM dT20-SS-β-annulus, 450 µM β-annulus-His16, and 1 µM mCherry mRNA). (**B**) Relative fluorescence intensity of mCherry from HepG2 cells.

#### **4. Conclusions**

We constructed artificial viral capsids co-assembled from viral β-annulus peptides bearing dT20 at the N-terminus and unmodified peptides. Capsids bearing interior dT20 enable encapsulation of mCherry mRNA bearing poly(A) tails via hybridization to form ribonuclease-resistant spherical assemblies of approximately 50 nm in diameter. Furthermore, His16 modification on the exterior of the capsids provides a novel material for mRNA delivery. Further research will explore potential applications of artificial viral capsids as delivery systems for nucleic acid drugs.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/22/8004/s1, Figure S1: Concentration dependence of size distribution obtained from DLS for the aqueous solution of dT20-SS-β-annulus in PBS (pH 7.4) at 25 °C, Figure S2: Size-distributions obtained from DLS and TEM images for co-assembly peptides (dT20-SS-β-annulus:β-annulus = 1:4.5 (a), 1:2 (b), and 1:1 (c) in water at 25 °C), Figure S3: Gel shift assay obtained after hybridization of co-assembly peptides and mCherry mRNA in PBS (pH 7.4) incubated at various temperatures for 10–30 min, and Figure S4: Size-distributions obtained from DLS and TEM images for co-assembly peptides (dT20-SS-β-annulus:β-annulus = 1:9) and mCherry mRNA in PBS buffer (pH 7.4) obtained after incubation at 4 °C (a), 37 °C (b), and 60 °C (c) for 10–30 min.

**Author Contributions:** Conceptualization, K.M.; methodology, T.I. and H.I.; investigation, Y.N. and Y.S.; writing, Y.N. and K.M.; visualization, Y.S. and H.I.; supervision, K.M.; funding acquisition, K.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by a Grant-in-Aid for Scientific Research on Innovative Areas "Chemistry for Multimolecular Crowding Biosystems" (grant number: JP18H04558) and a Grant-in-Aid for Scientific Research (B) (grant number: JP18H02089).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Communication*

## **Bifunctional Aptamer Drug Carrier Enabling Selective and Efficient Incorporation of an Approved Anticancer Drug Irinotecan to Fibrin Gels**

### **Hiroto Fujita, Yuka Kataoka and Masayasu Kuwahara \***

Graduate School of Integrated Basic Sciences, Nihon University, 3-25-40 Sakurajosui, Setagaya-ku, Tokyo 156-8550, Japan; fujita.hiroto@nihon-u.ac.jp (H.F.); kataoka.yuka@nihon-u.ac.jp (Y.K.)

**\*** Correspondence: mkuwa@chs.nihon-u.ac.jp; Tel.: +81-03-5317-9398

Received: 10 September 2020; Accepted: 4 December 2020; Published: 7 December 2020

**Abstract:** We have previously developed a bifunctional aptamer (bApt) binding to both human thrombin and camptothecin derivative (CPT1), and showed that bApt acts as a drug carrier under the phenomenon named selective oligonucleotide entrapment in fibrin polymers (SOEF), which enables efficient enrichment of CPT1 into fibrin gels, resulting in significant inhibition of tumor cell growth. However, although the derivative CPT1 exhibits anticancer activity, it is not an approved drug. In this study, we evaluated the binding properties of bApt to irinotecan, a camptothecin analog commonly used for anticancer drug therapy, in addition to unmodified camptothecin (CPT). Furthermore, we have revealed that irinotecan binds to bApt like CPT1 and is selectively concentrated on fibrin gels formed around the tumor cells under the SOEF phenomenon to suppress cell proliferation.

**Keywords:** drug delivery system; anticancer drug; camptothecin derivative; irinotecan

#### **1. Introduction**

Highly water-soluble derivatives of camptothecin, a type of quinoline alkaloid (irinotecan and topotecan), are currently known to have a broad spectrum of anticancer activity against cancers such as lung cancer, colorectal cancer, ovarian cancer, and malignant lymphoma [1–7]. These derivatives are the most widely used clinically because they bind to type I topoisomerase and inhibit the religation of DNA, thereby inhibiting DNA synthesis, causing cancer cell apoptosis, and resulting in extremely potent anticancer activity [4]. Irinotecan is biotransformed by the action of carboxylesterase into SN-38 (7-ethyl-10-hydroxy-20(S)-camptothecin), which lacks the piperidine-containing substituent at the 10-position. Additionally, the cytotoxicity of SN-38 is 100–1000 times greater than that of irinotecan (Figure S1) [8–10]. However, it causes serious side effects, and there is demand for a mechanism such as a drug delivery system (DDS) to deliver drugs specifically and reduce dosages [11–16]. For example, a liposomal irinotecan called Onivyde has been reported and approved for medical use [17,18].

Previously, we have reported that thrombin-binding aptamer (TBA) is selectively and efficiently incorporated into the gel through thrombin during the fibrin gel formation known as the blood coagulation reaction [19–24]. In the fibrin gel formation process, first, thrombin cleaves FpA and FpB from the Aα and Bβ chains located at the N-terminus of fibrinogen to produce fibrin monomers [25]. The resulting fibrin monomer engulfs thrombin and polymerizes to form a gel. Therefore, it was suggested that by using this phenomenon named selective oligonucleotide entrapment in fibrin polymers (SOEF), the desired substance or functional group could be introduced into the fibrin gel with TBA [26].

Recently, it has been shown that fibrin plays an important role when cancer cells metastasize or invade other organs, and that fibrinogen is abundant in the cancer stroma [11]. Taking advantage of SOEF, the bifunctional aptamer (bApt), which combines a 29-mer TBA with a camptothecin-binding aptamer, can deliver and concentrate camptothecin as a potent anticancer drug around cancer cells and efficiently suppress their growth (Figure 1). Previously, we developed a camptothecin (CPT)-binding modified DNA aptamer (CMA-70), which is a base-modified DNA aptamer obtained by the systematic evolution of ligands by exponential enrichment (SELEX) method targeting the camptothecin derivative (CPT1) (Figure 1) [27–29]. The ability to selectively introduce CPT1 into fibrin gel using bApt as a drug carrier under the SOEF phenomenon therefore provides a new strategy for cancer chemotherapy [30]. However, as a shortcut to the practical application of this DDS, it is desirable to use an approved anticancer drug such as irinotecan, and thus, this study was conducted.

**Figure 1.** Design of bifunctional aptamer (bApt) conjugated with 29-mer thrombin-binding aptamer (TBA) and a camptothecin (CPT)-binding modified DNA aptamer (CMA-70) containing modified nucleic acids (2′-deoxyuridine-5′-phosphate (dUadTP)) represented by 't'. Red boxes indicate the conjugation site.

#### **2. Materials and Methods**

#### *2.1. Materials*

To synthesize CMA-70 and bApt, oligonucleotides, nucleoside triphosphates (dATP, dGTP, dCTP), and *KOD Dash* DNA polymerase were, respectively, purchased from Japan Bio Services Co. Ltd. (Saitama, Japan), Roche Diagnostics K. K. (Tokyo, Japan), and Toyobo Co. Ltd. (Tokyo, Japan). For the fluorescence polarization assay, human fibrinogen and thrombin were purchased from Merck K. K. (Tokyo, Japan) and irinotecan and CPT were supplied by Tokyo Chemical Industry Co., Ltd. (Tokyo, Japan). For the cell cultivation and cell growth inhibition assay, Dulbecco's Modified Eagle's Medium (DMEM) low glucose was purchased from Wako Pure Chemical Industries, Ltd., fetal bovine serum (Gibco™) was purchased from Thermo Fisher Scientific K. K. (Tokyo, Japan), CellTracker™ Green CMFDA (5-chloromethylfluorescein diacetate) Dye was purchased from Life Technologies Japan, Ltd. (Osaka, Japan), and HeLa cells (JCRB9004) were purchased from the National Institutes of Biomedical Innovation, Health and Nutrition (Osaka, Japan). All other reagents were of research grade.

#### *2.2. Enzymatic Synthesis of CMA-70 and bApt*

The CMA-70 aptamer was synthesized by one-primer PCR using a primer (CMA-70\_P1), a template (CMA-70\_Temp), three 2′-deoxyribonucleoside triphosphates (dATP, dGTP, and dCTP), and a modified 2′-deoxyuridine-5′-triphosphate (dUadTP) with *KOD Dash* DNA polymerase (Table S1) [27,29]. The antisense strand of 5′-monophosphate-labeled ODNs (CMA-70\_Temp) was selectively degraded through λ-exonuclease treatment. Then, the resulting CMA-70 was purified via polyacrylamide gel electrophoresis.

The aptamer bApt was synthesized by one-primer PCR using a primer (TBA\_P1), a template (T1), three 2′-deoxyribonucleoside triphosphates (dATP, dGTP, and dCTP), and a modified 2′-deoxyuridine-5′-triphosphate (dUadTP) with *KOD Dash* DNA polymerase (Table S1) [30]. The synthesized bApt was purified by polyacrylamide gel electrophoresis (Figure S2).

#### *2.3. Fluorescence Polarization Assay for CMA-70 Versus CPT Derivatives*

To analyze target binding specificity, a fluorescence polarization assay for each target (CPT1, irinotecan, and CPT; final concentration of 0.10 µM) with increasing concentrations of CMA-70 (final concentrations of 0, 0.010, 0.025, 0.050, 0.075, 0.10, 0.50, and 1.0 µM) was performed at 25 °C using an LS-55 fluorescence spectrometer.

First, CMA-70 was dissolved in 1 × phosphate-buffered saline (PBS) (11.8 mM HPO4<sup>2−</sup>, 140 mM Cl−, 157 mM Na+, 4.5 mM K+; pH 7.4) at appropriate concentrations (up to 2.0 µM), and refolded by denaturing at 94 °C for 0.5 min (to protect the modified nucleic acids) and subsequently cooling to 25 °C at a rate of 0.5 °C/min. Each CMA-70 solution (50 µL each) was mixed with 50 µL of a target solution (CPT1, irinotecan, or CPT; 0.20 µM) in PBS buffer, and incubated at 25 °C for 1 h. Fluorescence polarization for each mixture was recorded every 20–30 s for 20 min at 25 °C with excitation at 372 nm and monitoring at 456 nm. The fluorescence polarization values at the 8 aptamer concentrations were then plotted as titration curves (Figure 2).
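Because equal volumes (50 µL each) of aptamer and target solutions are mixed, each stock must be twice its intended final concentration. The following Python sketch checks this against the values given above; the helper function name is hypothetical and the calculation assumes simple volume additivity.

```python
# Final CMA-70 concentrations (uM) listed for the titration.
FINAL_CMA70_UM = [0, 0.010, 0.025, 0.050, 0.075, 0.10, 0.50, 1.0]

APTAMER_VOLUME_UL = 50.0
TARGET_VOLUME_UL = 50.0
TARGET_STOCK_UM = 0.20
TOTAL_VOLUME_UL = APTAMER_VOLUME_UL + TARGET_VOLUME_UL


def stock_for_final(final_um, added_ul, total_ul):
    """Stock concentration needed so that added_ul diluted to total_ul gives final_um."""
    return final_um * total_ul / added_ul


# CMA-70 stock concentrations required before mixing (2x the final values).
stock_um = [
    stock_for_final(c, APTAMER_VOLUME_UL, TOTAL_VOLUME_UL) for c in FINAL_CMA70_UM
]

# Final target concentration after mixing with the aptamer solution.
target_final_um = TARGET_STOCK_UM * TARGET_VOLUME_UL / TOTAL_VOLUME_UL

print("CMA-70 stocks (uM):", stock_um)
print("final target (uM):", target_final_um)
```

The highest stock is 2.0 µM, and the 0.20 µM target solution is diluted to the stated final concentration of 0.10 µM.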

#### *2.4. Fluorescence Polarization Assay for Time Course Analyses of bApt Complexes*

Thrombin was dissolved in distilled water, and then, a 2.0 µM thrombin solution in 1 × PBS was prepared. Similarly, mother liquors of fibrinogen (20 µM) and irinotecan (1.0 µM) were prepared in 1 × PBS.

A solution containing bApt (1.0 µM) was prepared in 1 × PBS, and then, bApt was refolded by annealing (preheating at 94 °C for 0.5 min followed by cooling to 25 °C at a rate of 0.5 °C/min). First, the irinotecan solution (63 µL, 1.1 µM) was placed in a 1 cm cuvette and the fluorescence polarization (FP) measurement was started. After 20 min, the bApt solution (7 µL, 1.0 µM) was added to the same cuvette. Subsequently, after 40 min of FP measurement, the thrombin solution (3.5 µL, 2.0 µM) was added to produce a ternary complex of thrombin, irinotecan, and bApt. Finally, the fibrinogen solution (3.5 µL, 20 µM) was added after 60 min of FP measurement, and FP was measured for another 20 min (Figure 3). The final concentration of thrombin/irinotecan/bApt was 0.10 µM and that of fibrinogen was 1.0 µM.

#### *2.5. Cell Cultivation and Cell Growth Inhibition Assay*

Trypsinized HeLa cells were diluted to a concentration of 1.0 × 10<sup>4</sup> cells/mL with a fresh medium containing DMEM-low glucose, a heat-inactivated serum (Gibco™, FBS, US origin), and an antibiotic solution (penicillin–streptomycin) at a percentage proportion of 90/9/1 (v/v/v%). The medium (300 µL) containing the cells was transferred to a 96-well tissue culture plate. Then, the cells were cultured in a humidified incubator containing 95% air and 5% CO2 at 37 °C for 24 h. After 240 µL of the medium was removed, 240 µL of the new medium was added to each well. Then, 30 µL of the fibrinogen solution (10 µM) was added and mixed via gentle pipetting. Subsequently, the solution of the ternary thrombin/CPT1/bApt complex (30 µL, 1.0 µM) was added and mixed by gentle pipetting. Solutions of the binary thrombin/TBA complex and free CPT1 (30 µL, 1.0 µM each) were used as controls. Similarly, the abovementioned solutions for the other control experiments were examined. Furthermore, control experiments without fibrinogen (replaced with 1× PBS) were conducted. The cells were cultured in a humidified incubator containing 95% air and 5% CO2 at 37 °C for 48 h.

Next, 300 µL of the medium was replaced with 100 µL of medium containing DMEM-low glucose and CellTracker™ Green CMFDA dye in dimethyl sulfoxide (10 mM) at a percentage proportion of 99.8/0.2 (v/v%). The sample was then incubated in a humidified incubator containing 95% air and 5% CO2 at 37 °C for 30 min. After 300 µL of the medium was replaced with 300 µL of 1× PBS, the stained cell layer was examined under a BZ-X710 inverted fluorescence microscope to acquire fluorescence images (40×) using excitation light (480 nm) and a cut-off filter (<520 nm) (Figure 4). Three independent experiments were performed.

#### **3. Results**

#### *3.1. Fluorescence Polarization Assay Using CMA-70*

Fluorescence polarization measurements were conducted to examine the binding properties of CMA-70, the parent aptamer of the camptothecin-derivative-binding site in bApt, to CPT1, irinotecan, and camptothecin (CPT) (Figure 2A–C). Figure 2D shows that the change in fluorescence polarization (∆FP) of irinotecan increases with the CMA-70 concentration, as is the case for CPT1 and CPT. The results indicated that the affinity of CMA-70 for irinotecan exceeded that for CPT.

**Figure 2.** Chemical structures of (**A**) camptothecin derivative (CPT1), (**B**) irinotecan, and (**C**) camptothecin (CPT) and (**D**) titration curves for CPT1 (•), irinotecan (N), and CPT () polarization versus CMA-70 concentration; polarization was monitored at 456 nm using the excitation wavelength of CPT1, irinotecan, and CPT (372 nm).

#### *3.2. Fluorescence Polarization Assay Using bApt*

Next, fluorescence polarization measurements of irinotecan were performed to observe the stepwise binding of bApt to its target and the uptake into fibrin gels (Figure 3). Figure 3A shows the time course in differential fluorescence polarization degrees by (i) addition of bApt to irinotecan, (ii) addition of thrombin, and (iii) addition of fibrinogen, followed by polymerization (gelation) of the fibrin monomer generated by thrombin cleavage activity. From Figure 3B, irinotecan bound to the CMA moiety of bApt, as the addition of bApt increased the fluorescence polarization by approximately 4 mP. Furthermore, by adding thrombin, thrombin bound to the TBA moiety of bApt, which increased the fluorescence polarization by approximately 18 mP. Finally, when fibrinogen was added, the fluorescence polarization degree increased by approximately 120 mP, indicating that irinotecan was incorporated into the gel along with bApt during the process of fibrin gel formation.
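For readers unfamiliar with the mP unit used above, the following Python sketch shows the standard definition of fluorescence polarization from the parallel and perpendicular emission intensities. The intensity values and the instrument correction factor (G factor) of 1.0 are hypothetical and chosen only to illustrate the calculation; they are not measurements from this study.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP).

    P = (I_par - G * I_perp) / (I_par + G * I_perp), reported as 1000 * P.
    """
    corrected = g_factor * i_perpendicular
    return 1000.0 * (i_parallel - corrected) / (i_parallel + corrected)


# Hypothetical intensities (arbitrary units) before and after a binding event.
free_mp = polarization_mp(105.0, 100.0)
bound_mp = polarization_mp(120.0, 100.0)
delta_mp = bound_mp - free_mp
print(f"free = {free_mp:.1f} mP, bound = {bound_mp:.1f} mP, change = {delta_mp:.1f} mP")
```

A small fluorophore such as irinotecan tumbles quickly and gives low polarization; binding to a larger species, or entrapment in a gel, slows its rotation and raises the value.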

**Figure 3.** Mechanism of in situ condensation in fibrin gel based on the selective oligonucleotide entrapment in fibrin polymers (SOEF) phenomenon; entrapment of (**A**) the thrombin/irinotecan/bApt complex during the fibrin gel formation. (**B**) The titration curves for irinotecan polarization versus bApt concentration; polarization was monitored at 456 nm using an excitation wavelength of 372 nm.

#### *3.3. Cell Growth Inhibition Assay*

Finally, cell growth inhibition assays were performed using HeLa cells to verify whether bApt could be used as a drug carrier for irinotecan. Fluorescent staining was performed with CellTracker™, which selectively stains only live cells. As shown in Figure 4D,H, nonemissive images were obtained only in the presence of the thrombin/CPT1/bApt complex or the thrombin/irinotecan/bApt complex after 48 h of incubation. Furthermore, fibrin gels containing bApt are not cytotoxic [30]. Thus, it was demonstrated that irinotecan, like CPT1, significantly inhibits cell proliferation under the SOEF phenomenon.

**Figure 4.** Fluorescence microscopy images of the HeLa cells incubated for 48 h at 37 ◦C after addition of (**A**,**E**) fibrinogen and thrombin, (**B**,**F**) fibrinogen, thrombin and CPT1 or irinotecan, (**C**,**G**) fibrinogen, CPT1 or irinotecan, and bApt, and (**D**,**H**) fibrinogen, thrombin, CPT1 or irinotecan, and bApt. The thrombin, CPT1 or irinotecan, and bApt concentrations are 100 nM each and fibrinogen concentration is 1000 nM. Scale bar = 50 µm.

#### **4. Discussion**

Fluorescence polarization measurements showed that the CMA part of bApt binds to irinotecan, and further, the irinotecan-binding bApt is efficiently incorporated into fibrin gels by binding the TBA part of bApt to thrombin. Furthermore, cell assays revealed that fibrin gels densely containing irinotecan can cause cancer cell death.

In this study, 29-mer TBA was used at the TBA part of bApt known to have a *K*<sup>d</sup> value of 0.5 nM for thrombin. This binding to thrombin can delay the blood coagulation reaction by about 16 s compared to the normal condition without any inhibitors [24]. Meanwhile, in the presence of TBAs with different sequences of similar lengths (15–38 mer) with *K*<sup>d</sup> values of 0.7–126 nM for thrombin, their blood coagulation reactions were delayed by about 2 s [24]. Therefore, the 29-mer TBA sequence used in this study has a relatively high affinity for thrombin and exhibits inhibitory activity although it does not completely inhibit the activity of thrombin.
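To give a feel for what these *K*<sub>d</sub> values imply, the following Python sketch estimates the fraction of thrombin bound at equilibrium. It assumes a simple 1:1 binding model and uses the 100 nM aptamer and thrombin concentrations from the cell assay; neither the model nor the resulting fractions are taken from the original measurements.

```python
import math


def complex_concentration(aptamer_nm, protein_nm, kd_nm):
    """Equilibrium complex concentration for 1:1 binding (all values in nM).

    Solves C**2 - (A + P + Kd) * C + A * P = 0 for the physically valid root.
    """
    total = aptamer_nm + protein_nm + kd_nm
    return (total - math.sqrt(total ** 2 - 4.0 * aptamer_nm * protein_nm)) / 2.0


def fraction_bound(aptamer_nm, protein_nm, kd_nm):
    """Fraction of protein present in the aptamer-protein complex."""
    return complex_concentration(aptamer_nm, protein_nm, kd_nm) / protein_nm


# Kd values quoted in the text: 0.5 nM (29-mer TBA) and 126 nM (weakest binder).
tight = fraction_bound(100.0, 100.0, 0.5)
weak = fraction_bound(100.0, 100.0, 126.0)
print(f"Kd = 0.5 nM: {tight:.2f} bound; Kd = 126 nM: {weak:.2f} bound")
```

Under these assumptions, most of the thrombin would be complexed with the 29-mer TBA, whereas a 126 nM binder would capture only about a third of it.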

Recently, the incorporation of thrombin bound to iron oxide nanoparticles into fibrin gels was observed by electron microscopy [31]. That report agrees with the efficient introduction of bApt into fibrin gels observed in this study, which is due to the high avidity of the TBA for thrombin and the efficient thrombin-mediated uptake of TBA into fibrin gels [26,30].

Moreover, in this study, irinotecan was loaded at the CMA site of bApt instead of CPT1. Recently, we have demonstrated that, using bApt carrying CPT1, CPT1 was concentrated in a thin layer of about 50 µm thickness formed by fibrin gels, and thereby the growth inhibitory effect on cancer cells was about 182 times higher than that of CPT1 only [30]. The inhibitory concentration (IC50) values of CPT1 entrapped in fibrin gel (7.6 ± 0.36 nM) were approximately 170 times lower than that of CPT1 only in the culture medium (1300 ± 600 nM) [30]. Similarly, in this study, irinotecan was concentrated in a thin layer formed by fibrin gel and taken up by the cells, resulting in a higher cancer cell growth inhibitory effect than that obtained using irinotecan only. The IC50 values of irinotecan for HeLa cells were reported to be 1320 ± 130 nM [2,32]. Therefore, the IC50 values of irinotecan entrapped in fibrin gel were estimated to be approximately several nM to several tens of nM.
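The order-of-magnitude estimate above can be reproduced with simple arithmetic. The following Python sketch uses the IC50 values quoted in this paragraph and assumes, purely for illustration, that irinotecan benefits from the same fold enhancement as CPT1; that assumption is not an experimental result.

```python
# IC50 values (nM) quoted in the text.
CPT1_FREE_NM = 1300.0
CPT1_GEL_NM = 7.6
IRINOTECAN_FREE_NM = 1320.0

# Fold reduction in IC50 when CPT1 is entrapped in fibrin gel.
fold_enhancement = CPT1_FREE_NM / CPT1_GEL_NM

# ASSUMPTION: irinotecan is enhanced by the same factor as CPT1.
irinotecan_gel_estimate_nm = IRINOTECAN_FREE_NM / fold_enhancement

print(f"fold enhancement for CPT1: {fold_enhancement:.0f}")
print(f"estimated irinotecan IC50 in gel: {irinotecan_gel_estimate_nm:.1f} nM")
```

The resulting estimate of a few nM is consistent with the stated range of several nM to several tens of nM, given the large uncertainty in the free CPT1 value (± 600 nM).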

For the application of bifunctional molecular recognition biomolecules to pharmaceuticals, an antibody (emicizumab) with two target recognition sites is currently approved for treating hemophilia [33,34]. By contrast, for nucleic acid aptamers, it has been reported that aptamer–aptamer conjugates enable signal control of a tyrosine kinase receptor [35] or delivery of doxorubicin to tumor cells/brain lesions [36], and that aptamer–antibody or aptamer–aptamer conjugates can attract T cells and cancer cells and destroy the cancer cells [37–39]; however, they have not yet been put to practical use [35–40].

The bApt oligonucleotide binds to both the anticancer drug and thrombin and acts as a carrier for anticancer drugs, which enables the concentration of the anticancer drug in fibrin gels around growing cancer cells. In the design of bApt, the anticancer drug binding site is compatible with other functionalities. Therefore, the concept and methodology may also be applicable to diseases related to blood coagulation in the future.

#### **5. Conclusions**

We demonstrated that the approved drug irinotecan can be selectively and efficiently concentrated into fibrin gels. In the future, the bApt we have developed can be applied as a drug carrier in DDS of anticancer agents around cancer tissues where fibrin production is active. Furthermore, drug carriers created by combining approved drugs with their aptamers, and their application to DDS, are expected to be applicable to diseases related to fibrin, such as blood coagulation disorders.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/23/8755/s1, Figure S1: Metabolism of irinotecan to SN-38, Figure S2: Enzymatic synthesis of bApt through a primer extension reaction, Table S1: Synthetic oligonucleotides used in this study.

**Author Contributions:** Conceptualization, M.K.; methodology, M.K.; validation, H.F. and Y.K.; formal analysis, H.F. and Y.K.; investigation, H.F. and Y.K.; data curation, H.F. and Y.K.; writing—original draft preparation, H.F. and Y.K.; writing—review and editing, H.F., Y.K. and M.K.; supervision, M.K.; project administration, M.K.; funding acquisition, M.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This paper was supported by a Grant for Translational Research Program (H355TS) from the Japan Agency for Medical Research and Development (AMED), and Nihon University Multidisciplinary Research Grant for 2020.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Effective RNA Regulation by Combination of Multiple Programmable RNA-Binding Proteins**

#### **Misaki Sugimoto, Akiyo Suda, Shiroh Futaki and Miki Imanishi \***

Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan; sugimoto.misaki.25c@st.kyoto-u.ac.jp (M.S.); suda.akiyo.3v@kyoto-u.ac.jp (A.S.); futaki@scl.kyoto-u.ac.jp (S.F.)

**\*** Correspondence: imiki@scl.kyoto-u.ac.jp; Tel.: +81-774-38-3212

Received: 7 September 2020; Accepted: 25 September 2020; Published: 28 September 2020

#### **Featured Application: RNA targeting.**

**Abstract:** RNAs play important roles in gene expression through translation and RNA splicing. Regulation of specific RNAs is useful to understand and manipulate specific transcripts. Pumilio and fem-3 mRNA-binding factor (PUF) proteins, programmable RNA-binding proteins, are promising tools for regulating specific RNAs by fusing them with various functional domains. The key question is: How can PUF-based molecular tools efficiently regulate RNA functions? Here, we show that the combination of multiple PUF proteins, compared to using a single PUF protein, targeting independent RNA sequences at the 3′ untranslated region (UTR) of a target transcript caused cooperative effects to regulate the function of the target RNA by luciferase reporter assays. It is worth noting that a higher efficacy was achieved with smaller amounts of each PUF expression vector introduced into the cells compared to using a single PUF protein. This strategy not only efficiently regulates target RNA functions but would also be effective in reducing off-target effects due to the low doses of each expression vector.

**Keywords:** RNA binding protein; PUF; RNA regulation

#### **1. Introduction**

Manipulation of specific RNAs is important for elucidating the roles of specific RNAs and for therapeutic purposes as represented by RNA interference and antisense strategies targeting specific mRNAs. Other than such complementary oligonucleotides, sequence-specific RNA-binding proteins, Pumilio and fem-3 mRNA-binding factor (PUF) proteins and pentatricopeptide repeat (PPR) proteins, and clustered regularly interspaced short palindromic repeats (CRISPR)-based systems, including dCas13 and dCas9-PAMer complexed with guide RNA, have been used for the customized sequence-specific regulation of mRNAs [1–6]. By combining them with various functional domains, such as translational regulation factors [7,8], splicing factors [9–11], RNA editing enzymes [12], and RNA (de)methylation enzymes [13–18], targeted control of RNAs can be achieved. The RNA-binding domains of PUF proteins consist of eight highly homologous structural repeats (repeats 1 to 8), flanked by N- and C-terminal repeats (repeats N and C) [19] (Figure 1a). Each repeat recognizes an RNA base through direct interaction between RNA bases and side chains of amino acids at the 12th and 16th positions [20] (Figure 1b). By changing the amino acid pairs of a PUF repeat, artificial RNA-binding domains corresponding to various 8-nucleotide (nt) RNA sequences can be designed [9,21,22]. As the length of RNA recognized by PUFs (8-nt) is not enough to specify the target RNAs within a huge transcriptome, efforts have been made to increase the length of the RNA-binding sequences by increasing the number of PUF repeats [21,23–26]. Previous reports also showed the assembly of two PUF proteins in adjacent regions on the RNA to image specific transcripts [27–30]. By contrast, Cao et al. reported that increasing the number of the PUF-binding sequences in reporter mRNAs results in the effective regulation of reporter genes compared to reporter mRNAs having a single PUF-binding sequence [7]. This was also supported through the regulation of endogenous mRNA by PUFs fused to the RNA decay factor, tristetraprolin (TTP). Abil et al. designed artificial PUF proteins, targeting the 3′ untranslated region (UTR) of vascular endothelial growth factor A (*VEGFA*) mRNA. TTP-PUF fusion proteins effectively repressed VEGFA production when the particular 8-nt PUF-binding sequence existed at multiple places within the 3′ UTR of the target mRNA [31]. However, it is unusual for many repeated sequences to be present at the control region of a single mRNA. Instead of increasing the number of target RNA sequences, we intended to use four different PUF proteins whose corresponding target RNA sequences exist at the control region of a reporter RNA (Figure 1c). In this study, we expressed PUF RNA-binding proteins fused with TTP in mammalian cells and enabled the efficient repression of the reporter protein expression by using multiple TTP-PUF proteins.

**Figure 1.** Programmable RNA binding protein, PUF, and regulation of RNA functions. (**a**) Structure of a PUF protein bound to RNA [PDB ID: 1M8Y] and schematic representation of RNA recognition by a PUF protein. (**b**) 12th and 16th amino acids (A<sup>12</sup> and A<sup>16</sup>) in each PUF repeat interact with an RNA base. (**c**) Schematic representation of mRNA regulation by multiple artificial PUF proteins fused with TTP. (**d**) Design of PUF proteins, PUF (A)–(D), and their recognition RNA sequences, (a)–(d), used in this research.

#### **2. Materials and Methods**

#### *2.1. Plasmid Construction*

PUF expression vectors for *E. coli* expression were constructed by a Golden Gate Assembly using the PUF Assembly Kit (Addgene #1000000051) and pET28-GG-PUF as the receiving vector [31]. The amino acid sequences at the 12th and 16th positions of each repeat are shown in Figure 1d.

TTP-PUF expression vectors for mammalian cells were constructed by Golden Gate Assembly using the PUF Assembly Kit and pCMV-TTP(C147R)-GG-PUF as the receiving vector.

Luciferase reporter vectors were constructed, as described in the text that follows. The DNA oligonucleotides containing the corresponding sequences for 4 × RNA (a)–(d), or 1 × (a-b-c-d) and their complementary DNA oligonucleotides (Eurofins genomics) were phosphorylated by T4 polynucleotide kinase (NEB) and annealed; they were inserted into the 3′ UTR of the pmirGLO vector (Promega). The sequences of 4 × RNA (a)–(d), or 1 × (a-b-c-d) are shown in Supplementary Figure S1.

#### *2.2. Protein Expression and Purification*

PUF proteins were expressed in *E. coli* BL21 (DE3) cells (Nippon gene) and purified as described before [23]. Briefly, protein expression was induced by adding 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), and *E. coli* cells were incubated at 18 °C overnight. PUF proteins were purified from the *E. coli*-soluble fraction using Ni-NTA HisTrap FF column (GE Healthcare) chromatography followed by ultrafiltration by an Amicon Ultra (10 kDa NMWL) (Merck Millipore).

#### *2.3. Electrophoretic Mobility Shift Assay (EMSA)*

RNA oligonucleotides, labeled with 6-FAM at the 5′-end, were obtained from FASMAC. Labeled RNA (10 nM) and three-fold serially diluted concentrations of purified PUF proteins (0–300 nM) were mixed in the reaction buffer containing 10 mM HEPES, 50 mM KCl, 0.5 mM EDTA, 2 mM DTT, 0.01% Tween20, 0.1 mg/mL BSA, 0.02 U/µL RNasin Plus RNase Inhibitor (Promega), and 2.5% Ficoll. The mixture was incubated at 25 °C for 30 min. Free RNA and bound RNA were separated via gel electrophoresis using 8% nondenaturing polyacrylamide gels in 0.5× TBE at 4 °C. After electrophoresis, fluorescently labeled RNAs were detected using the Typhoon scanner RGB system (GE Healthcare). The peak intensity for bound and free labeled RNA was measured using the ImageQuant TL software (GE Healthcare). The fractions of bound RNA were plotted against the protein concentrations. For calculating equilibrium dissociation constants (*K*<sub>d</sub>), the plotted data were fit to the 1:1 binding equation using the KaleidaGraph software (Hulinks) as described before [23].
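The fitting step described above can be illustrated with a minimal Python sketch. It is not the authors' KaleidaGraph procedure: the simple hyperbolic form of the 1:1 binding equation (which treats free protein as approximately equal to total protein), the grid-search fit, and all function names are assumptions made for illustration, and the data points are synthetic.

```python
import math


def serial_dilution(top_nM, factor=3.0, steps=5):
    """Return `steps` concentrations, each `factor`-fold below the previous one."""
    return [top_nM / factor ** i for i in range(steps)]


def fraction_bound(protein_nM, kd_nM):
    """Simple 1:1 binding isotherm (assumes free protein ~ total protein)."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, bound, kd_min=0.01, kd_max=1000.0, n_grid=4001):
    """Least-squares Kd estimate by scanning a log-spaced grid of candidates."""
    log_min, log_max = math.log10(kd_min), math.log10(kd_max)
    best_kd, best_sse = None, float("inf")
    for i in range(n_grid):
        kd = 10 ** (log_min + (log_max - log_min) * i / (n_grid - 1))
        sse = sum((fraction_bound(p, kd) - b) ** 2
                  for p, b in zip(protein_nM, bound))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Protein series of the kind described in the text: 0 nM plus a
# three-fold dilution series down from 300 nM.
concs = [0.0] + sorted(serial_dilution(300.0))

# Synthetic, noise-free bound fractions generated with Kd = 10 nM
# (hypothetical data, used only to show that the fit recovers the input).
synthetic = [fraction_bound(p, 10.0) for p in concs]
kd_est = fit_kd(concs, synthetic)
print([round(c, 1) for c in concs], round(kd_est, 2))
```

With 10 nM labeled RNA and *K*<sub>d</sub> values in the low-nanomolar range, the approximation that free protein equals total protein is only rough; a quadratic binding equation that accounts for ligand depletion would be the more careful choice in that regime.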

#### *2.4. Cell Culture*

NIH3T3 cells (RIKEN cell bank; RCB1862) were maintained in Dulbecco's modified Eagle's medium (DMEM) (Fujifilm Wako Pure Chemical) containing 10% FBS.

#### *2.5. Western Blotting Assay*

Cells were lysed using the RIPA buffer, and cell lysates were subjected to SDS-PAGE. The proteins were detected using mouse anti-FLAG antibody (Sigma Aldrich; St. Louis, MO, USA, F3165) or mouse anti-β-actin antibody (Sigma Aldrich; AC-74) as primary antibodies, and Goat anti-mouse IgG antibody (GeneTex) as the Horseradish peroxidase (HRP)-labeled secondary antibody. The bands were visualized using an Amersham ECL Prime (GE Healthcare, Chicago, IL, USA).

#### *2.6. Luciferase Reporter Assay*

Here, 1.5 × 10<sup>4</sup> NIH3T3 cells were seeded onto 96-well plates the day before transfection. Transfection was performed using Lipofectamine LTX (Thermo Fisher Scientific, Waltham, MA, USA). Transfection mixtures contained 100 ng TTP-PUF expression vector and 1 ng reporter vector. For control, pCMV-TTP(C147R)-GG-PUF (Addgene #1000000051), in which the TTP gene but no PUF-encoding sequences were inserted, was used instead of TTP-PUF expression vectors. Then, 24 h after transfection, cells were lysed using Passive Lysis Buffer (Promega). Firefly luciferase and *Renilla* luciferase activities were measured using the Dual-Luciferase Reporter Assay System (Promega) with measurements taken on a GloMax-Multi Detection System (Promega). Firefly luciferase activity (Fluc) was normalized to *Renilla* luciferase activity (Rluc), and then relative luciferase activities (RLAs), compared to control samples, were calculated (Equation (1)). The degree of repression by a TTP-PUF relative to the control was indicated by the reciprocal of the relative luciferase activity (Equation (2)).

$$\text{RLA (TTP-PUF)} = \frac{\text{Fluc (TTP-PUF)}/\text{Rluc (TTP-PUF)}}{\text{Fluc (control)}/\text{Rluc (control)}} \tag{1}$$

$$\text{Fold repression} = 1/\text{RLA} \tag{2}$$
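The normalization described above (firefly signal divided by *Renilla* signal, then divided by the same ratio for the control, with the reciprocal taken as the degree of repression) can be sketched in a few lines of Python. The function names and the luminescence readings below are hypothetical and serve only to show the order of operations.

```python
def relative_luciferase_activity(fluc_sample, rluc_sample,
                                 fluc_control, rluc_control):
    """RLA: Fluc/Rluc of the TTP-PUF sample divided by Fluc/Rluc of the control."""
    return (fluc_sample / rluc_sample) / (fluc_control / rluc_control)


def fold_repression(rla):
    """Degree of repression relative to the control: the reciprocal of RLA."""
    return 1.0 / rla


# Hypothetical raw luminescence readings (arbitrary units), not measured data.
rla = relative_luciferase_activity(
    fluc_sample=2000.0, rluc_sample=10000.0,
    fluc_control=10000.0, rluc_control=10000.0,
)
fold = fold_repression(rla)
print(rla, fold)
```

Dividing by the *Renilla* signal first means that differences in transfection efficiency between wells cancel out before the sample is compared with the control.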

#### **3. Results and Discussion**

#### *3.1. RNA-Binding Properties of Programmable RNA-Binding Proteins*

Four PUF proteins, PUF (A), (B), (C), and (D), targeting different RNA sequences, RNA (a), (b), (c) and (d), respectively, were designed by assembling PUF repeats corresponding to each RNA base in order (Figure 1d). EMSA revealed the high affinity and specificity of PUF proteins to target RNA sequences. In this assay, the binding of PUF proteins reduced the mobility of the target RNAs, allowing the evaluation of RNA-binding abilities of proteins. PUF (A)–(D) proteins were expressed in *E. coli* cells and purified via affinity chromatography using a Ni-NTA column. PUF (A), (B), (C), and (D) showed band-shifts for RNA (a), (b), (c), and (d), respectively (Figure 2, Figure S2a). The apparent dissociation constants between each PUF protein and corresponding RNA were 10.1 nM (PUF (A)/RNA (a)), 5.3 nM (PUF (B)/RNA (b)), <1 nM (PUF (C)/RNA (c)), and 9.1 nM (PUF (D)/RNA (d)) (Table 1, Figure S2b). Although the *K*<sub>d</sub> values were somewhat different between PUFs, they were within the reported range of engineered PUF proteins (10<sup>−10</sup>–10<sup>−8</sup> M) [23,31]. Importantly, all PUF (A)–(D) proteins showed high specificity to their corresponding RNA sequences (Figure S2a). RNA (a) showed clear shift bands only when incubated with PUF (A) but not with the other three PUFs. Similarly, RNA (b)–(d) showed clear shift bands only when incubated with their corresponding PUFs.

**Figure 2.** RNA binding of engineered PUF proteins to target RNAs. In the electrophoretic mobility shift assay (EMSA), RNA oligonucleotides, RNA (a)–(d), were incubated with corresponding PUF proteins, PUF (A)–(D), respectively, at the following concentrations (left to right): 0, 3.7, 11, 33, 100, 300 nM, and the mixtures were electrophoresed in native PAGE. "F" and "B" indicate protein-free RNA and protein-bound RNA, respectively. RNA (a) showed two protein-free bands (shown by \*).


**Table 1.** RNA binding of engineered PUF proteins to specific RNA sequences.

<sup>1</sup> mean ± SD, n.d.; not detected.

#### *3.2. Specific RNA Binding of TTP-PUF within Cells*

The RNA binding specificities of PUF (A)–(D) for their corresponding target RNAs were also confirmed within living cells using luciferase reporter assays (Figure 3, Figure S1a,b). PUF (A)–(D) fused with tristetraprolin (TTP), an mRNA decay factor, were designed as described before [23,31]. As a control, a TTP expression vector that expresses only TTP without fusing with PUFs was used to estimate the effect of PUF [23,31]. Previous reports showed that a TTP-PUF fusion protein, which can bind to the 3′ UTR of the luciferase reporter gene transcript, represses luciferase activity [23,31]. The degree of luciferase repression reflects the RNA binding of TTP-PUF fusion proteins. Here, we prepared reporter vectors that express both firefly luciferase and *Renilla* luciferase genes driven by a PGK and an SV40 promoter, respectively (Figure S1a,b). A four-times repeat sequence of the PUF target sequences, 4 × (a), (b), (c) or (d), was inserted at the 3′ UTR of the firefly luciferase gene (Figure 3a). Each TTP-PUF expression or control vectors and a reporter vector were transfected into NIH3T3 cells. In this reporter system, we expected the firefly luciferase activity to be affected by TTP-PUFs, and *Renilla* luciferase activity to reflect the transfection efficiencies. Thus, the firefly luciferase activity, under the expression of each TTP-PUF, was normalized to the *Renilla* luciferase activity of the same sample. The normalized luciferase activity was compared to that of the control sample to evaluate the degree of repression. The expression of the TTP-PUF proteins was confirmed via Western blotting assay (Figure S1c). As shown in Figure 3b, TTP-PUF (A), (B), (C), and (D) showed a 4–6-fold repression in luciferase gene expression with corresponding RNA-binding sequences at the 3′ UTR. By contrast, they had little effect on luciferase gene expression without corresponding binding sites at the 3′ UTR. 
None of the TTP-PUFs repressed the luciferase activity in the no binding site (nbs) reporter. These results indicated that the PUF proteins specifically bound to the corresponding RNA sequences at the 3′ UTR of the luciferase gene even within cells.

**Figure 3.** Specific binding of engineered PUF proteins to target RNAs in living cells. (**a**) Schematic representation of TTP-PUF proteins and reporter RNAs. TTP-PUF expression plasmids and luciferase reporter plasmids were co-transfected into NIH3T3 cells. The luciferase reporter genes were transcribed, and reporter RNAs with PUF-binding sequences at the 3′ untranslated region (UTR) should have been produced in the cells. (**b**) Fold repression of luciferase reporter activities by TTP-PUF proteins. Data are presented as mean ± SD.

#### *3.3. Effective Repression of Reporter Genes with Not-Repetitive Sequences by Multiple TTP-PUF Proteins*

It is unusual for a sequence to be repeated four or more times in the regulatory region of a single transcript. Instead of targeting repeat sequences at the 3′ UTR, the synergistic effect of multiple TTP-PUFs was shown through the targeting of multiple RNA sequences at the 3′ UTR. As a model transcript, a reporter vector with a 1 × (a-b-c-d) sequence at the 3′ UTR of the firefly luciferase gene was designed, in which only one binding sequence for each PUF (A)–(D) existed at the 3′ UTR (Figure 3a). In contrast to the 4–6-fold repression of the luciferase genes with 4 × binding sequences, each TTP-PUF (A), (C), or (D) showed only about a 1.5-fold repression (Figure 4, lanes 3,5,6), and TTP-PUF (B) alone did not repress luciferase activity (Figure 4, lane 4), possibly because of the low TTP-PUF (B) expression compared to other TTP-PUF proteins (Figure S1c).

**Figure 4.** Combination of TTP-PUF (A)–(D) with 1 × (a-b-c-d) reporter efficiently repressed the luciferase reporter activities at the 3′ UTR of the luciferase mRNA. NIH3T3 cells were co-transfected with 1 × (a-b-c-d) reporter (1 ng) and 100 ng (total) of the following expression vectors. Lane 1, control vector; lane 2, mixture of TTP-PUF (A), (B), (C), and (D) (25 ng each); lanes 3–6, 100 ng of TTP-PUF (A) (lane 3), (B) (lane 4), (C) (lane 5), (D) (lane 6); lanes 7–10, 75 ng of control vector and 25 ng of TTP-PUF (A) (lane 7), (B) (lane 8), (C) (lane 9), (D) (lane 10). Data are presented as mean ± SD. (\*\*\*, *p* < 0.005; \*\*, *p* < 0.01; \*, *p* < 0.05; n.s. = not significant).

When NIH3T3 cells were co-transfected with all of the expression vectors (TTP-PUF (A)–(D); 25 ng each) and the 1 × (a-b-c-d) reporter vector, significant repression (2.5-fold) of luciferase activity was observed, even though the amount of each expression vector (25 ng each) was one-fourth of that when used alone (100 ng) (Figure 4, lane 2). The mixture of four TTP-PUF expression vectors had little effect on the reporter lacking a PUF-binding sequence (Figure S3). Furthermore, when the cells were transfected with 25 ng of one of the four TTP-PUF vectors, the same amount used for the mixture, none of them showed significant repression of the 1 × (a-b-c-d) reporter (Figure 4, lanes 7–10). These results indicated that the luciferase activity of the 1 × (a-b-c-d) reporter vector was repressed by the sequence-specific binding of TTP-PUF proteins to their binding sites at the 3′ UTR. Additionally, the combination of four TTP-PUF proteins, targeting different RNA sequences, caused effective repression of luciferase activity compared to the use of a single TTP-PUF protein. Differences in repression ratios were observed between reporters with 4 × binding sequences by the corresponding single TTP-PUF (4–6-fold) (Figure 3b) and the 1 × (a-b-c-d) reporter by the mixture of TTP-PUF (A)–(D) (2.5-fold) (Figure 4). One plausible reason for the difference is avidity effects for the 4 × proximity binding sites. The overall RNA structure of the PUF binding sequences can affect the binding of TTP-PUFs. Further optimization will increase the synergistic effects of multiple functional PUF proteins.

So far, synergistic gene activation or repression was shown by a combination of multiple artificial transcription regulators based on transcription activator-like effectors (TALEs) and CRISPR-dCas9 [32,33]. The results using a combination of multiple RNA binding proteins are in accordance with these reports targeting DNA.

#### **4. Conclusions**

In this study, four independent PUF proteins were designed and shown to recognize the target RNA sequence with high specificity both in vitro and in cellulo. By combining these four PUF proteins, we showed the use of multiple TTP-PUF proteins in the effective repression of target gene expression, even though the amount of each transfected expression vector was smaller than that of a single PUF protein. Although further verification is needed to understand the off-target effects, we expect that this strategy will reduce off-target effects because of a lower dose of expression vectors. Sequence-specific RNA-binding molecules, like PUFs, can exert desired functions at specific RNA regions when fused with various functional domains. As the importance of RNAs in life science becomes clear, there is an increasing demand for methods to control specific RNAs. Combination of multiple PUF proteins would be a promising strategy to control endogenous RNAs effectively. Furthermore, this concept might be applied to other RNA-targeting molecular tools, such as PPR proteins and CRISPR-Cas13 based systems.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/19/6803/s1, Figure S1: Plasmids used for luciferase reporter assays and expression of TTP-PUFs, Figure S2: Representative results of EMSA, Figure S3: Effect of the mixture of TTP-PUFs on the nbs reporter.

**Author Contributions:** Conceptualization, M.I. and M.S.; investigation, M.S. and A.S.; writing—M.I. and S.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by JSPS/KAKENHI (19H02850 to M.I.).

**Acknowledgments:** We kindly thank Huimin Zhao for plasmids to construct PUFs (Addgene #1000000051).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **An RNA Triangle with Six Ribozyme Units Can Promote a** *Trans***-Splicing Reaction through Trimerization of Unit Ribozyme Dimers**

**Junya Akagi <sup>1</sup> , Takahiro Yamada <sup>1</sup> , Kumi Hidaka <sup>2</sup> , Yoshihiko Fujita <sup>3</sup> , Hirohide Saito <sup>3</sup> , Hiroshi Sugiyama 2,4 , Masayuki Endo 2,4 , Shigeyoshi Matsumura <sup>1</sup> and Yoshiya Ikawa 1,\***



**Citation:** Akagi, J.; Yamada, T.; Hidaka, K.; Fujita, Y.; Saito, H.; Sugiyama, H.; Endo, M.; Matsumura, S.; Ikawa, Y. An RNA Triangle with Six Ribozyme Units Can Promote a *Trans*-Splicing Reaction through Trimerization of Unit Ribozyme Dimers. *Appl. Sci.* **2021**, *11*, 2583. https://doi.org/10.3390/app11062583

Academic Editors: Tamaki Endoh and Chih-Ching Huang

Received: 29 November 2020 Accepted: 11 March 2021 Published: 14 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Abstract:** Ribozymes are catalytic RNAs that are attractive platforms for the construction of nanoscale objects with biological functions. We designed a dimeric form of the *Tetrahymena* group I ribozyme as a unit structure in which two ribozymes were connected in a tail-to-tail manner with a linker element. We introduced a kink-turn motif as a bent linker element of the ribozyme dimer to design a closed trimer with a triangular shape. The oligomeric states of the resulting ribozyme dimers (kUrds) were analyzed biochemically and observed directly by atomic force microscopy (AFM). Formation of kUrd oligomers also triggered *trans*-splicing reactions, which could be monitored with a reporter system to yield a fluorescent RNA aptamer as the *trans*-splicing product.

**Keywords:** catalytic RNA; group I ribozyme; RNA nanostructure; RNA nanotechnology; RNA–protein complex; *trans*-splicing

#### **1. Introduction**


Folded polypeptides work not only in monomeric states but also in assembled states, including homo- and heterooligomers [1–3]. In the course of enzyme evolution, symmetrical protein homooligomers with polygonal and polyhedral shapes emerged from their monomeric ancestors, presumably due to the advantages in their biological properties over the respective monomeric states, including enzymatic activities and structural stability [1–3]. Interfaces between monomer units play a key role in the assembly of protein oligomers. The artificial design of protein oligomers with symmetrical shapes has been explored with the use of protein–protein interaction interfaces extracted from naturally occurring protein oligomers, generating the field of protein nanostructure research [4–6]. Recent challenges in this field include de novo design of artificial protein–protein interaction interfaces that can serve as modular structural units for versatile protein nanostructure design [7,8]. Among the various forms of symmetrical protein homooligomers, symmetrical protein homodimers can be regarded as the simplest molecular parts for the rational design and assembly of protein structures [9–11].

RNA is a biopolymer that can fold into defined 3D structures with biological functions. Structured RNAs are known to perform many types of protein-like functions, including roles as catalysts and receptors [12–14]. Large and complex 3D RNA structures are frequently formed through noncovalent assembly of structural modules, which can fold locally within a single polynucleotide chain [15]. Intramolecular assembly of RNA structural modules is supported and maintained by RNA–RNA interaction interfaces [15,16]. Their association is usually strong enough to reconstitute the catalytic activity in a multimolecular format, in which separately prepared modular RNAs are assembled noncovalently by RNA–RNA interaction interfaces [17–25].

We recently engineered an RNA–RNA interaction interface of the *Tetrahymena* group I ribozyme to design a prototype of ribozyme modules (abbreviated as RzM-0, Figure 1) with self-dimerizing capability [26,27]. The RNA–RNA interface in the RzM-0 homodimer is supported by two sets of tertiary interactions between the ∆P5 core ribozyme module and P5abc module, which are connected by flexible linker elements (P5-P5a linker) in the parent ribozyme and form strong intramolecular associations (WT group I ribozyme, Figure 1). We rationally replaced the flexible P5-P5a with a rigid duplex to prevent intramolecular ∆P5-P5abc assembly (Figure S1C), resulting in homodimerization of the resulting variant (RzM-0 in Figure 1) in a head-to-head manner by forming RzM:RzM interface **0:0** [26,27]. A pair of symmetrical ∆P5-P5abc interfaces yielding RzM:RzM homodimer was then engineered to asymmetrical recognition interfaces, such as RzM:RzM interface **1:1**′, yielding RzM:RzM heterodimers (Figure 1). We then connected a pair of RzM monomers in a tail-to-tail manner at their P6b region, enabling the resulting covalent dimer (termed unit ribozyme dimer and abbreviated as Urd) to homooligomerize in a directional manner (Figure 1) [28]. Unit ribozyme dimers (Urds) homooligomerized to form an open chain assembly (Figure 1). The extents of oligomerization were programmable by designing the specificity of the assembly interfaces between two ribozyme modules (RzMs) in Urds [28]. We also expected that the parent Urds could be modified to produce closed forms of symmetrical oligomers if we used a bent P6b-P6b linker. In this study, we constructed and examined a new series of Urds, which were designed to form closed trimers with a triangular shape.

**Figure 1.** Design of closed oligomers based on engineered group I ribozyme dimers. Scheme of modular redesign to generate unit ribozyme dimers (kUrds) and their oligomerization. White arrows indicate structural redesign to prepare RzM RNAs and their noncovalent dimerization as reported by Tanaka et al. in 2016 [26]. Gray arrows indicate rational redesign for covalent dimerization of RzM RNAs reported by Kiyooka et al. in 2020 [28]. Yellow arrows indicate structural redesign to construct kUrd **1-K-1**′ in this study. Sequences and secondary structures of wild-type ribozyme, M1 variant ribozyme and their structural elements are shown in Figure S1.

#### **2. Materials and Methods**

#### *2.1. Molecular Design*

Three-dimensional structural models of kink-turn unit ribozyme dimer (abbreviated as kUrd) **1-K-1**′ and its closed homotrimer were constructed using the crystal structure of a shortened form of the *Tetrahymena* group I ribozyme (PDB ID: 1X8W) [29], a model 3D structure of its full-length form [27] and the KT-15 kink-turn motif (PDB ID: 1JJ2) [30]. Molecular modeling was performed using Discovery Studio (BIOVIA, San Diego, CA, USA), following a protocol that has been described with an example of constructing a 3D model structure of the RzM:RzM dimer [27]. In this study, the modeling was extended to connect three RzM:RzM dimers and three KT-15 motifs by pairs of A-form RNA duplexes (15 bp/10 bp).

#### *2.2. Plasmid Construction and RNA Preparation*

Plasmids encoding the sequences of α- and β-chains of kUrd RNAs were derived in two steps from plasmids encoding chimeric constructs of the wild-type *Tetrahymena* group I intron and its M1 variant-type intron [31]. PCR-based mutagenesis was used in each step of the stepwise plasmid construction. We first changed the sequences of P5-P5a linker elements of the starting plasmids to form the rigid P5-P5a duplex to yield RzM:RzM interfaces. The resulting plasmids were further modified by replacing the sequences of P6b elements to introduce the KT-15 kink-turn as the bent linker element. The resulting plasmids were used as templates for PCR. For each PCR, the sense primer contained the T7 promoter sequence. PCR-amplified DNA templates were used for synthesis of α- and β-chains of kUrd RNAs by in vitro transcription with T7 RNA polymerase. Transcription reactions were performed for 4.5 h at 37 °C in the presence of nucleotide triphosphates (1 mM each), 15 mM Mg<sup>2+</sup>, 40 mM Tris-Cl (pH 7.5), 10 mM dithiothreitol and 2 mM spermidine. The DNA template in the reaction mixture was removed by DNase treatment for 30 min. The transcribed RNA was purified on 4% denaturing polyacrylamide gels. 3′-End labeling of RNAs with BODIPY fluorophore was performed according to the published protocol in which the diol moiety in the 3′ end ribose was oxidized by NaIO<sub>4</sub> to produce two aldehyde moieties, which were then connected with a primary amino group in the BODIPY derivative in the presence of NaBH<sub>3</sub>CN [32].

#### *2.3. Preparation of L7Ae Protein and L7Ae–EGFP Fusion Protein*

L7Ae protein and L7Ae–EGFP fusion protein were prepared according to the published protocol [33,34]. Briefly, the pET 28-b vector was used for cloning and construction of the recombinant protein L7Ae from *Archaeoglobus fulgidus* and its EGFP fusion protein. *Escherichia coli* BL-21 (DE3) (pLysS) cells were then used for protein production.

#### *2.4. Electrophoretic Mobility Shift Assay (EMSA) of kUrd Oligomers*

To analyze homooligomerization of kUrd **1-K-1**′, an aqueous solution (18 µL) containing its α- and β-chain RNAs (0.14 µM) was heated at 80 °C for 2.5 min and then cooled to 37 °C. The 10× concentrated buffer (2 µL) containing 300 mM Tris-Cl (pH 7.5) and 100–200 mM Mg(OAc)<sub>2</sub> was added to the RNA solution. The resulting solution contained 30 mM Tris-Cl (pH 7.5), 10–20 mM Mg(OAc)<sub>2</sub> and 0.125 µM each α- and β-chain RNA to form 0.125 µM **1-K-1**′. To analyze heterotrimers, three kUrd solutions (6.67 µL each), each of which contained 0.375 µM each α- and β-chain RNA, were prepared separately and their mixed solution (20 µL) was then prepared. To analyze heterodimers, two kUrd solutions (10 µL each), each of which contained 0.25 µM each α- and β-chain RNA, were prepared separately and their mixed solution (20 µL) was then prepared. The resulting solution (20 µL) containing kUrd homo- or heterooligomers was incubated at 37 °C for 30 min and then at 4 °C for 30 min, and 6× loading buffer containing 50% glycerol and 0.1% xylene cyanol (4 µL) was added. The samples (24 µL) were loaded onto a 5% non-denaturing polyacrylamide gel (29:1 acrylamide:bisacrylamide) containing 50 mM Tris-acetate (pH 7.5) and 5–25 mM Mg(OAc)<sub>2</sub>. Electrophoresis was performed at 4 °C, 200 V for the initial 5 min, followed by 75 V for 5 or 12 h. Gels were analyzed using a Pharos FX fluoroimager (BioRad, Hercules, CA, USA). Each RzM:RzM interface consists of a pair of ∆P5/P5abc interfaces. The binding affinity between the ∆P5 core module and the P5abc module varies considerably depending on the identity of the ∆P5/P5abc interface [35,36]. Stability of the interface is improved by increasing the Mg2+ concentration [35,36]. In each EMSA, we therefore tuned the concentration of Mg2+.
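The concentrations quoted in this protocol follow from the dilution relation C1·V1 = C2·V2. The following is a minimal sketch of that arithmetic, not part of the original protocol; the helper name `diluted` and the variable names are illustrative assumptions, and the input values are those stated in the text.

```python
def diluted(c_stock, v_stock_uL, v_final_uL):
    """Concentration after dilution, from C1*V1 = C2*V2 (units of c_stock are preserved)."""
    return c_stock * v_stock_uL / v_final_uL

# Homooligomer sample: 18 uL of 0.14 uM RNA plus 2 uL of 10x buffer -> 20 uL
homo_uM = diluted(0.14, 18.0, 18.0 + 2.0)

# Heterotrimer sample: three 6.67 uL solutions (0.375 uM each kUrd) pooled
hetero_trimer_uM = diluted(0.375, 6.67, 3 * 6.67)

# Heterodimer sample: two 10 uL solutions (0.25 uM each kUrd) pooled
hetero_dimer_uM = diluted(0.25, 10.0, 2 * 10.0)

# 10x buffer components after addition of 2 uL to a 20 uL final volume (mM)
tris_mM = diluted(300.0, 2.0, 20.0)
mg_range_mM = (diluted(100.0, 2.0, 20.0), diluted(200.0, 2.0, 20.0))

print(f"homooligomer kUrd: {homo_uM:.3f} uM")
print(f"heterotrimer kUrd: {hetero_trimer_uM:.3f} uM each")
print(f"heterodimer kUrd:  {hetero_dimer_uM:.3f} uM each")
print(f"Tris-Cl: {tris_mM:.0f} mM; Mg(OAc)2: {mg_range_mM[0]:.0f}-{mg_range_mM[1]:.0f} mM")
```

Under these inputs, all three sample types arrive at approximately 0.125 µM of each kUrd, which is why band intensities can be compared across homo- and heterooligomer lanes.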

#### *2.5. Atomic Force Microscopy*

Atomic force microscopy (AFM) was performed on a high-speed AFM (Nano Live Vision; RIBM, Tsukuba, Japan) according to the protocol reported previously [37]. The sample solutions containing kUrd RNAs were diluted to a final concentration of ~80 nM in folding buffer containing 17.5 mM Mg2+ and 2 µL of the sample was deposited onto the mica surface of the AFM.

#### *2.6. Electrophoretic Mobility Shift Assay (EMSA) of the kUrd Trimer with L7Ae Protein*

Three kUrd solutions (4.1 µL each), each of which contained 30 mM Tris-Cl (pH 7.5), 17.5 mM Mg(OAc)<sub>2</sub> and 0.61 µM each α- and β-chain RNA, were prepared separately according to the protocol for kUrd oligomer formation. To each kUrd solution (4.1 µL) was added 5× concentrated binding buffer (1.33 µL) containing 50 mM HEPES-KOH (pH 7.5), 750 mM KCl and 7.5 mM MgCl2. A solution (1.25 µL) of L7Ae (or L7Ae–EGFP) protein at an appropriate concentration was then added, and the resulting solution (6.68 µL) was incubated at 37 °C for 30 min. The three kUrd+L7Ae solutions were then mixed. The resulting solution (20 µL) containing three kUrds (0.125 µM each) and L7Ae was incubated at 37 °C for 30 min and then at 4 °C for 30 min, and 6× loading buffer containing 50% glycerol and 0.1% xylene cyanol (4 µL) was added. The samples (24 µL) were analyzed in the same manner as in the EMSA of kUrd oligomers.
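As a consistency check on the volumes above, the following sketch recomputes the per-tube volume, the pooled volume, the final kUrd concentration and the approximate 1× binding-buffer composition. It is an illustration under the stated volumes only; the variable names are assumptions, and contributions of the protein stock to the buffer are ignored because its composition is not given in the text.

```python
kurd_uL, buffer_uL, protein_uL = 4.1, 1.33, 1.25

per_tube_uL = kurd_uL + buffer_uL + protein_uL   # one kUrd + L7Ae mixture
pooled_uL = 3 * per_tube_uL                      # three mixtures combined

# Each kUrd starts at 0.61 uM in 4.1 uL and ends up in the pooled volume
kurd_final_uM = 0.61 * kurd_uL / pooled_uL

# 5x binding buffer (mM) diluted into one tube
binding_buffer_5x_mM = {"HEPES-KOH": 50.0, "KCl": 750.0, "MgCl2": 7.5}
buffer_fraction = buffer_uL / per_tube_uL
binding_buffer_final_mM = {
    name: conc * buffer_fraction for name, conc in binding_buffer_5x_mM.items()
}

# Mg(OAc)2 carried over from the kUrd folding solution (17.5 mM in 4.1 uL)
mg_oac_carryover_mM = 17.5 * kurd_uL / per_tube_uL

print(f"per tube: {per_tube_uL:.2f} uL, pooled: {pooled_uL:.2f} uL")
print(f"each kUrd after pooling: {kurd_final_uM:.3f} uM")
for name, conc in binding_buffer_final_mM.items():
    print(f"{name}: {conc:.1f} mM")
print(f"Mg(OAc)2 carried over: {mg_oac_carryover_mM:.1f} mM")
```

Because the three tubes have identical composition apart from the kUrd identity, pooling leaves the buffer concentrations unchanged and dilutes each kUrd threefold.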

#### *2.7. Substrate Cleavage Reaction*

To prepare a set of kUrd monomer solutions, aqueous solutions containing appropriate pairs of α- and β-chain RNAs (final concentration of each kUrd: 0.25 µM) were heated at 80 °C for 5 min and then cooled to 37 °C. Then, 10× concentrated reaction buffer (final concentration: 30 mM Tris-Cl, pH 7.5 and 3 mM or 5 mM MgCl2) and 2 mM guanosine triphosphate (final concentration: 0.2 mM) were added to the RNA solution and incubated at 37 °C for 30 min to form a given kUrd monomer. The resulting kUrd monomer solutions were then mixed to afford a kUrd oligomer solution, which was additionally incubated at 37 °C for 30 min. Ribozyme reactions were started by adding substrate-a (5′-FAM-GGCCCUCUAAAAA-3′, final concentration: 0.25 µM) conjugated with carboxyfluorescein (FAM) at its 5′ end [37]. The resulting solution containing 0.25 µM each kUrd and 0.25 µM substrate-a was kept at 37 °C. Aliquots were taken at given time points and the mixtures were electrophoresed on 15% denaturing polyacrylamide gels. The substrate and reaction products were analyzed using a Pharos FX fluoroimager. All of the ribozyme activity assays were repeated at least twice and representative gel images are shown.

#### *2.8. Trans-Splicing Monitored by Fluorescence of Spinach RNA–DFHBI*

To prepare three kUrd monomer solutions, aqueous solutions (17 µL) containing appropriate pairs of α- and β-chain RNAs (final concentration of each kUrd: 0.25 µM) were heated at 80 °C for 5 min and then cooled to 37 °C. Then, 10× concentrated reaction buffer (2 µL, final concentration: 40 mM Tris-Cl, pH 7.5, 125 mM KCl, 5 mM MgCl2 and 10 µM DFHBI) and 2 mM guanosine triphosphate (1 µL, final concentration: 0.2 mM) were added to the RNA solution and incubated at 37 °C for 30 min to form a given kUrd monomer. The resulting kUrd monomer solutions were then mixed to afford a kUrd oligomer solution (60 µL) containing 0.25 µM each kUrd and to initiate the *trans*-splicing reaction. The solution was transferred quickly to the wells of a microplate with a mineral oil overlay. The plate was then incubated at 37 °C in a plate reader (Infinite F200 Pro; Tecan, Männedorf, Switzerland) pre-warmed to 37 °C, and the fluorescence intensity of the sample solution was recorded.

#### **3. Results**

#### *3.1. Design of Unit Ribozyme Dimers (Urds) with a Bent Linker*

To design a new series of Urds to form closed trimers, we needed a suitable RNA motif consisting of two helices with fixed (~60°) angles as a bent linker unit. We selected the kink-turn RNA motif (Figure 1 and Figure S2) [30], which is one of the most extensively studied RNA motif families and provides a sharp (~60°) bend to the helical axis (Figure 2A). Kink-turn motifs have been identified in a diverse range of naturally occurring RNA structures with various functions [30]. Here, we chose the KT-15 kink-turn motif (Figure 2A), which not only serves as an RNA structural element but also acts as a recognition element for the ribosomal protein, L7Ae (Figure 2A) [30]. This RNA–protein complex has also been used as a module in synthetic RNA–protein nanostructures [33,34,38,39] and in synthetic RNA switches [40–43].

**Figure 2.** Structures of kUrds and their assembly analyzed by EMSA. (**A**) 3D structures of KT-15 kink-turn motif (orange) with a pair of duplexes (green) and its binding with L7Ae protein (blue). Sequence and secondary structures of the linker elements of kUrds with three distinct pairs of duplexes. Small red and green icons indicate RNA motifs. Mutations in L2 elements to disrupt RzM:RzM interaction are also shown as elliptic gray icons. Sequences and secondary structures of these RNA motifs are shown in Figure S1. (**B**) EMSA of **1-K-1**′ homopolymers formed in the presence of 15 mM Mg2+. Asterisks indicate RNA chains labeled with BODIPY fluorophore. (**C**) EMSA of **1-K-1**′ homopolymers formed in the presence of 20 mM Mg2+. Asterisks indicate RNA chains labeled with BODIPY fluorophore.

#### *3.2. Homooligomerization of kUrds*

A new class of Urd variants bearing the KT-15 motif was designated as kink-turn unit ribozyme dimer (abbreviated as kUrd, Figure 1). To optimize kUrd linkers, we compared three linker elements, each of which had the KT-15 motif flanked by a pair of duplexes with 15 bp/10 bp, 15 bp/9 bp, or 14 bp/9 bp (Figure 2A). The three distinct kUrd linkers were used to connect two RzM monomers (RzM-1 and RzM-1′) in a tail-to-tail manner, yielding three **1-K-1**′ RNAs (Figure 2A and Figure S2). The three **1-K-1**′ RNAs were designed to homotrimerize to form closed trimers (Figure 1). For each **1-K-1**′, we prepared **1x-K-1**′ and **1-K-1x**′ to eliminate the assembly ability of the RzM on one side by introducing a UUCG tetraloop [44,45] into the L2 element (where **x** designates L2 UUCG, see Figures S1D and S3). We also prepared a **1x-K-1x**′ mutant for each **1-K-1**′ to eliminate the assembly ability of RzMs on both sides (Figure S3). We analyzed three **1-K-1**′ kUrds with distinct pairs of helices by electrophoretic mobility shift assay (EMSA) to characterize their assembly properties. Each **1-K-1**′ kUrd was composed of two RNA strands (α- and β-RNA chains) (Figures S2B and S3, Table S1). A preliminary experiment showed that the two RNA strands assembled efficiently to form the kUrd monomer (Figure S4). We first labeled the 3′-end of the β-chain RNA with the BODIPY fluorophore [32] to visualize kUrd and its assembly in native gels in the presence of 10 mM Mg2+ (Figure S5A), 15 mM Mg2+ (Figure 2B) and 20 mM Mg2+ (Figure 2C).

BODIPY-labeled β-chain RNA of **1x-K-1x**′, which can also be used as β-chain RNA of **1-K-1x**′, showed broad and multiple bands (lanes 1, 5 and 9 in Figure 2B,C). The β-chain RNA, however, showed a sharp and retarded band in the presence of an equimolar amount of the partner α-chain RNA (lanes 2, 6 and 10 in Figure 2B,C), indicating assembly of the α- and β-chain RNAs to form a structured kUrd **1x-K-1x**′ monomer. Introduction of L2-UUCG loops into both of the RzMs in kUrd caused **1x-K-1x**′ to selectively adopt a monomeric state (lanes 2, 6 and 10 in Figure 2B,C). We then confirmed the assembly state of **1-K-1**′ and its mutants by preparing a solution containing equal amounts of four kUrds (**1x-K-1x**′, **1x-K-1**′, **1-K-1x**′ and **1-K-1**′), prepared by mixing two α-chain RNAs and two β-chain RNAs in a single tube (lanes 3, 7 and 11 in Figure 2B,C). BODIPY-labeled β-chain RNAs for **1x-K-1x**′ and **1-K-1x**′ selectively visualized kUrd monomer (**1x-K-1x**′), kUrd dimer (**1x-K-1**′**:1-K-1x**′) and kUrd open trimer (**1x-K-1**′**:1-K-1**′**:1-K-1x**′). In addition to kUrd monomers, two new bands with lower mobilities were observed (lanes 3, 7 and 11 in Figure 2B,C), which were expected to be a kUrd dimer consisting mainly of **1x-K-1**′**:1-K-1x**′ and a kUrd open trimer consisting mainly of **1x-K-1**′**:1-K-1**′**:1-K-1x**′.

We then analyzed the solution containing only **1-K-1**′ (lanes 4, 8 and 12 in Figure 2B,C), which was expected to form homooligomers, including a closed form homotrimer. The dominant band was retained in the gel slot, suggesting the formation of high-molecular-weight oligomers. Two new bands, which were not seen in lanes 3, 7 and 11, were also observed. The lower mobility band would correspond to the desired closed trimer containing six ribozyme units (lanes 4, 8 and 12 in Figure 2B,C). The higher mobility band could be a closed dimer of Urds, which was not predicted in molecular modeling of **1-K-1**′. In the presence of 10 mM Mg2+, the unit kUrd appeared to form efficiently (lanes 2, 6 and 10 in Figure S5A, see also Figure S5B,C) but oligomeric states of kUrds were formed less efficiently than those with 15 mM Mg2+ (Figure S5B,C). Formation of oligomeric states of kUrds seemed to be enhanced in the presence of 20 mM Mg2+ compared to 15 mM Mg2+ (Figure S5B). Among the three kUrds with distinct pairs of duplexes (15 bp/10 bp, 15 bp/9 bp and 14 bp/9 bp) in the kink-turn linker elements, the 15 bp/10 bp and 15 bp/9 bp linkers worked better than the 14 bp/9 bp linker, which produced multiple bands in the presence of 20 mM Mg2+ (Figure 2C). In the following experiments, we used the 15 bp/10 bp pair as the kink-turn linker element.

#### *3.3. Block Copolymerization of kUrds*

Based on the homooligomerization of **1-K-1**′ forming the desired closed trimer and unexpected closed dimer, we then examined selective formation of a closed trimer and a closed dimer through copolymerization of two or three different kUrds (Figure 1). We substituted two RzMs in **1-K-1**′ to prepare a trio of Urds (**3-K-2**′, **2-K-4**′ and **4-K-3**′) for triblock-type kUrd copolymer. For this purpose, we introduced three RzM:RzM interactions **2:2**′, **3:3**′ and **4:4**′ (Figure 3A and Figure S1D), which were designed to be orthogonal to one another in their recognition specificity [28]. In a similar manner, a pair of Urds (**1-K-5**′ and **5-K-1**′) were also designed for diblock-type kUrd copolymer (Figure 3A and Figure S1D), which would selectively yield a closed dimer. In block copolymerization by two or three distinct kUrds, we first prepared each kUrd separately by assembling its α- and β-chain RNAs (Figures S2B and S4). Solutions of preformed kUrds were then mixed to form block copolymers of kUrds.

**Figure 3.** Assembly of a trio of kUrds to form a closed heterotrimer and a pair of kUrds to form a closed heterodimer. (**A**) Five distinct pairs of RzM:RzM interfaces, three of which were used in a closed heterotrimer configuration and the remaining two of which were used in a closed heterodimer configuration. Sequences and secondary structures of RNA motifs specifying RzM:RzM interfaces are shown in Figure S3. (**B**) EMSA of triblock and diblock kUrd copolymers formed in the presence of 20 mM Mg2+. Asterisks indicate RNA chains labeled with BODIPY fluorophore.

We first examined copolymerization of **3-K-2**′, **2-K-4**′ and **4-K-3**′ to obtain the selective formation of a closed trimer (Figure 3B). To distinguish the closed trimer from the open trimer, we also examined **4-K-3x**′, with which ring closure would be blocked by disruption of the **3:3**′ interface. As seen in lane 4 in Figure 3B, assembly of three kUrds provided a migrating band whose mobility was close to that of the band corresponding to the possible closed trimer in homopolymerization of **1-K-1**′ (lane 7 in Figure 3B). Higher oligomerization of kUrds seen in **1-K-1**′ homopolymers (lane 7 in Figure 3B) was still seen in triblock copolymers (lane 4 in Figure 3B). In the presence of **4-K-3x**′ in place of **4-K-3**′, no band was retained in the sample well (lane 5 in Figure 3B), consistent with the molecular design to form the open block trimer (**3-K-2**′**:2-K-4**′**:4-K-3x**′) lacking further oligomerization ability. Selective formation of the open form of kUrd trimer was suggested by the single migrating band (lane 5 in Figure 3B), which migrated slower than the open kUrd dimer (**4x-K-3:3-K-2x**′, see lane 3 in Figure 3B) but faster than the possible closed trimers (lanes 5 and 7 in Figure 3B). Importantly, in the trio of kUrds designed for triblock copolymer (lane 4 in Figure 3B), no band corresponding to the possible closed dimer of **1-K-1**′ (see lane 7 in Figure 3B) was observed. Bands corresponding to the open dimer and monomer were not seen in lane 4 (Figure 3B), suggesting that triblock oligomerization proceeded efficiently. Comparison between two trios of kUrds with/without ring closure ability (lanes 4 and 5 in Figure 3B) suggested that the closed trimer is thermodynamically stable and that dissociation of RzM interfaces in the closed trimer rarely occurred.
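The pairing logic behind the closed and open designs can be sketched in a few lines. This is an illustrative simplification and not the authors' design procedure: it assumes that interface *n* pairs only with *n*′, that an **x** in a label marks a disabled interface, and that each kUrd is present once. It checks only the necessary condition that every interface has a partner; it does not model geometry, so it cannot predict ring size or higher oligomers. The function names are hypothetical, and primes are written as ASCII apostrophes.

```python
from collections import Counter

def ends(name):
    """Split a kUrd label such as "3-K-2'" into its two interface labels."""
    left, right = name.split("-K-")
    return [left, right]

def partner(end):
    """Complementary interface label under the assumed rule n <-> n'."""
    return end[:-1] if end.endswith("'") else end + "'"

def unpaired_ends(kurds):
    """Return interfaces that are disabled (x) or lack a complementary partner."""
    all_ends = [e for k in kurds for e in ends(k)]
    active = Counter(e for e in all_ends if "x" not in e)
    return [
        e for e in all_ends
        if "x" in e or active[partner(e)] != active[e]
    ]

closed_trimer = ["3-K-2'", "2-K-4'", "4-K-3'"]
open_trimer = ["3-K-2'", "2-K-4'", "4-K-3x'"]
closed_dimer = ["1-K-5'", "5-K-1'"]

for label, design in [("triblock", closed_trimer),
                      ("triblock with 3x'", open_trimer),
                      ("diblock", closed_dimer)]:
    free = unpaired_ends(design)
    state = "all interfaces paired (ring closure possible)" if not free else f"free ends: {free}"
    print(f"{label}: {state}")
```

For the **4-K-3x**′ trio, the sketch reports the disabled 3x′ end and the orphaned 3 end as free, which corresponds to the two termini of the open block trimer described above.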

The pair of kUrds for diblock copolymer also formed oligomers with no migration (lane 6 in Figure 3B). This mixture also yielded a band the mobility of which corresponded to the possible closed dimer of **1-K-1**′ (lanes 6 and 7 in Figure 3B). No band was seen in lane 6 at the position corresponding to that of the possible closed trimer. In the closed heterotrimer consisting of three distinct kUrds, each kUrd was formed by α- and β-chain RNAs. As the kUrd closed heterotrimer contained six distinct RNA chains, we labeled one of six RNA chains with BODIPY fluorophore and monitored the mobility of the closed trimer on native gels (Figure S6). Regardless of the identity of the BODIPY-labeled RNA chain, all EMSA showed similar behavior, supporting the formation of the closed heterotrimer by assembly of three kUrds.

#### *3.4. Atomic Force Microscopy Analysis of kUrd Heterotrimer and Heterodimer*

Based on the observation that three kUrds are likely to form closed trimers in both homopolymerization and triblock copolymerization, we visualized the molecular structures of the heterotrimer (**3-K-2**′**:2-K-4**′**:4-K-3**′) and heterodimer (**1-K-5**′**:5-K-1**′) by atomic force microscopy (AFM) [33,37–39]. In the presence of 17.5 mM Mg2+, a sample containing three kUrds gave AFM images containing objects with triangular shapes (Figure 4A,B). The triangular objects were observed with limited frequency. We measured the lengths of 21 sides of seven triangular objects observed in AFM images. While their lengths varied between 28.1 nm and 39.6 nm, their average value was 34.5 nm. This value was close to that predicted by molecular modeling of the closed trimer (approximately 34 nm, Figure 1). In some of the AFM images of kUrd triangles, we observed triangular shapes as three bright regions (Figure 4B), the heights of which were 6–8 nm (Figure S7A).

We then analyzed AFM images of a solution containing **1-K-5**′ and **5-K-1**′ that formed a closed heterodimer, **1-K-5**′**:5-K-1**′ (Figure 4C). Due to the limited efficacy of the closed dimer formation (Figure 3B), AFM provided the corresponding images inefficiently. A few RNA objects, however, were found to reflect closed dimers (Figure 4C), in which two bright spots were observed in each object. Heights of the bright spots were 5–7 nm (Figure S7B). We also analyzed AFM images of two mutant kUrds with disrupted L2 interacting motifs (Figure S8). They retained some of the RNA interacting motifs but were not able to form kUrd dimers (Figure S8A). The two mutant kUrds formed some aggregates on mica surfaces but gave no bright spots in the presence of 17.5 mM Mg2+ (Figure S8C,D).

**Figure 4.** AFM imaging of closed kUrd heterotrimers and closed heterodimers. (**A**,**B**) AFM of closed kUrd trimers in the presence of 17.5 mM Mg2+. Yellow arrows indicate closed kUrd trimers. (**C**) AFM of closed kUrd dimers in the presence of 17.5 mM Mg2+. Yellow arrows indicate closed kUrd dimers.

#### *3.5. Decoration of the kUrd Closed Trimer with Kink-Turn Binding Protein L7Ae*

We then decorated the kUrd closed trimer with the RNA binding protein, L7Ae, for which the KT-15 kink-turn motif serves as a binding element (Figure 2A) [30]. We performed EMSA of the heterotrimer (**3-K-2**′**:2-K-4**′**:4-K-3**′) in the absence and presence of L7Ae protein. In the presence of each RNA chain at 0.125 µM (2.5 pmol each in 20 µL of solution), the theoretical amount of the KT-15 motif in the resulting three distinct kUrd units was 0.375 µM (7.5 pmol in 20 µL of solution).

In the presence of 0.375 µM L7Ae protein, slight but distinct retardation of the closed trimer was observed (lane 6 in Figure 5A). A twofold molar excess amount of L7Ae protein (0.75 µM) over the theoretical amount of kink-turn motif (0.375 µM) did not cause further retardation of the band (lane 7 in Figure 5A). The mobility of the trimer band did not change in the presence of a greater excess of L7Ae protein (Figure S9). These observations suggested that the retarded band corresponded to the RNA–protein complex consisting of the kUrd closed trimer possessing three KT-15 kink-turns and three L7Ae protein molecules. In the kUrd trimer–L7Ae complex, fluorescent visualization of L7Ae proteins was also examined using an engineered L7Ae protein fused to enhanced green fluorescent protein (EGFP) (Figure 5B) [46].
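The stoichiometry used in this titration can be reproduced with a short calculation. The sketch below assumes one KT-15 motif per kUrd, as implied by the three kink-turns in the closed trimer; the variable names are illustrative only.

```python
volume_uL = 20.0
chain_uM = 0.125          # each RNA chain, and hence each kUrd
n_kurds = 3               # 3-K-2', 2-K-4' and 4-K-3'

# uM x uL = pmol
kurd_pmol = chain_uM * volume_uL          # amount of each kUrd
kt15_pmol = n_kurds * kurd_pmol           # one KT-15 per kUrd (assumption from the design)
kt15_uM = kt15_pmol / volume_uL

# L7Ae concentrations tested in lanes 6 and 7 of Figure 5A
l7ae_uM = [0.375, 0.75]
ratios = [c / kt15_uM for c in l7ae_uM]

print(f"each kUrd: {kurd_pmol} pmol; KT-15 total: {kt15_pmol} pmol = {kt15_uM} uM")
for conc, ratio in zip(l7ae_uM, ratios):
    print(f"L7Ae {conc} uM -> {ratio:.1f} equivalents per KT-15")
```

The absence of a further shift at two equivalents is what supports the interpretation that all three KT-15 sites are already occupied at one equivalent.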

**Figure 5.** Complex formation of L7Ae protein with KT-15 kink-turn motifs in a closed kUrd heterotrimer. (**A**) EMSA of a closed kUrd heterotrimer in the absence and presence of L7Ae protein. (**B**) EMSA of a closed kUrd heterotrimer in the absence and presence of L7Ae protein and L7Ae-EGFP fusion protein.

L7Ae–EGFP fusion protein, which showed a broad band in free form (lane 4 in Figure 5B), shifted significantly in the presence of the kUrd heterotrimer (lane 3 in Figure 5B). In lane 3, the major band migrated more slowly than the free kUrd heterotrimer (lane 1) and also more slowly than the complex of closed trimer and L7Ae protein (lane 2). The major band in lane 3 corresponded to a complex of the closed trimer and L7Ae–EGFP fusion protein. The faster migrating band in lane 3 may correspond to an RNA–protein complex containing an open trimer or an open dimer because it was observed very weakly as a byproduct in assembly of the closed trimer (lane 4 in Figure S9B).

#### *3.6. Catalytic Activities of kUrds and Their Oligomers*

Although closed trimers and closed dimers were not formed exclusively without higher oligomers, it was interesting to see the catalytic abilities of ribozyme units in kUrds because their assembly would produce an active ribozyme unit consisting of the ∆P5 core ribozyme module and P5abc activator module (Figures 1 and 6A) [18].

As the ∆P5 core ribozyme can be activated by transient (weak) association of the P5abc module, the activity assay required a lower Mg2+ concentration than EMSA [36]. In the substrate cleavage reactions by the wild-type ∆P5 module (Figure S1A) and M1 type ∆P5 module (Figure S1B), the efficiencies of P5abc-dependent activation were optimum in the presence of 3 mM Mg2+ and 5 mM Mg2+, respectively. In the closed heterotrimer, six ∆P5-P5abc ribozyme units were formed, in which three had the wild-type ∆P5 modules (units 2, 3 and 4) with the remaining three having the M1 type ∆P5 module (units 2′, 3′ and 4′) (Figure 6A). We examined activation of the ∆P5 core ribozyme by the P5abc module upon formation of kUrd oligomers including the closed trimers.

**Figure 6.** Activation of ribozyme unit upon assembly of kUrds. (**A**) Formation of two ribozyme units in each RzM:RzM interface. One ribozyme unit (2, 3, or 4) formed the wild-type ribozyme and the other ribozyme unit (2′, 3′, or 4′) formed the M1 type ribozyme. (**B**) Sequences of substrate-a and two types of P1 binding elements. The location of the P1 element in the secondary structure of the *Tetrahymena* group I ribozyme is shown in Figure S1. (**C**) Cleavage of substrate-a by the wild-type ribozyme unit in a given kUrd without (top) or with (bottom) its partner kUrds in the presence of 3 mM Mg2+. (**D**) Cleavage of substrate-a by the M1 type ribozyme unit in a given kUrd without (top) or with (bottom) its partner kUrds in the presence of 5 mM Mg2+.

We developed an assay system with which each ∆P5-P5abc ribozyme unit in kUrd oligomers could be analyzed separately. In the ∆P5-P5abc ribozyme unit, the P1 substrate recognition element for the cleavage reaction could be introduced in a modular manner. In EMSA and AFM analysis, α- and β-chain RNAs commonly possessed a P1 recognition element for type-a substrate. To analyze the single ribozyme units in kUrds and in their oligomers selectively, we employed the second P1 element recognizing type-b substrate (Figure 6B) [47]. The substrate recognition of type-a and type-b P1 elements is highly orthogonal [37]. The type-a substrate-P1 pair was introduced selectively to one of six ribozyme units, while the remaining five units were designed to possess type-b P1 elements (Figure 6B). The cleavage reaction of type-a substrate thus enabled us to selectively monitor the catalytic ability of a ribozyme unit with a type-a P1 element (Figure 6B).
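The one-reporter-per-assay layout described above can be written out explicitly. The following sketch enumerates the six configurations, assigning the type-a P1 element to one unit and type-b to the other five, together with the Mg2+ concentration used for wild-type (3 mM) and M1 type (5 mM) units. The data structure and function names are hypothetical and serve only to illustrate the design logic; primes are written as ASCII apostrophes.

```python
UNITS = ["2", "3", "4", "2'", "3'", "4'"]   # six ribozyme units in the closed trimer

def module_type(unit):
    """Primed units carry the M1 type module; unprimed units carry the wild-type module."""
    return "M1" if unit.endswith("'") else "wild-type"

def mg_mM(unit):
    """Mg2+ concentration used in the cleavage assay for this module type."""
    return 5 if module_type(unit) == "M1" else 3

def assay_designs():
    """Yield one design per unit: that unit has a type-a P1 element, all others type-b."""
    for reporter in UNITS:
        yield {
            "reporter": reporter,
            "module": module_type(reporter),
            "Mg_mM": mg_mM(reporter),
            "P1": {u: ("a" if u == reporter else "b") for u in UNITS},
        }

designs = list(assay_designs())
for d in designs:
    p1 = " ".join(f"{u}:{t}" for u, t in d["P1"].items())
    print(f"unit {d['reporter']:<2} ({d['module']}, {d['Mg_mM']} mM Mg2+)  P1 -> {p1}")
```

Because only the reporter unit can bind substrate-a, any cleavage observed in a given design is attributable to that single unit.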

In the isolated state of each kUrd, its wild-type ∆P5 ribozyme module possessing the type-a P1 element showed no activity in the presence of 3 mM Mg2+ (Figure 6C, top). Addition of the partner kUrds in which the ∆P5 ribozyme module possessed only type-b P1 elements, however, induced the catalytic ability to cleave the type-a substrate (Figure 6C, bottom). In a similar manner, the M1 type ribozyme unit in each kUrd was also inactive with 5 mM Mg2+ (Figure 6D, top) but was activated by the addition of partner kUrds (Figure 6D, bottom). Such activation was primarily driven by the association of ∆P5 catalytic modules in each kUrd with P5abc activator modules in the partner kUrds. It should be noted that moderate differences were observed in the catalytic activities among the three ribozyme units sharing the wild-type (units 2, 3 and 4) or M1 type (units 2′, 3′ and 4′) ∆P5 module (Figure 6C,D). The catalytic activity of the ∆P5-P5abc ribozyme unit was governed by the identity of the ∆P5 module and also by the identity of the RNA–RNA interface between the ∆P5 module and the P5abc module (Figure 6A) [35,36].

#### *3.7. Trans-Splicing Reaction Promoted by kUrd Oligomer Formation*

The intron form of the ∆P5 core ribozyme module can perform the self-splicing reaction to yield ligated exons in the presence of the P5abc activator module (Figure 7A) [18]. The pair of ∆P5 core ribozyme modules in each kUrd were composed of α- and β-chain RNAs. In the intron form of the wild-type ∆P5 core ribozyme module on one side of kUrd, 5′ and 3′ exons were provided by α- and β-chain RNAs, respectively (Figure S10A). In the intron form of the M1 type ∆P5 core ribozyme module on the opposite side of kUrd, 5′ and 3′ exons were provided by β- and α-chain RNAs, respectively (Figure S6B). This organization of α- and β-chain RNAs in the intron form of the ∆P5 core ribozyme modules would allow a *trans*-splicing reaction joining 5′ and 3′ exons, which were provided by two separate RNA chains, to produce a mature exon sequence (Figure S10A,B) [31,48]. *Trans*-splicing promoted by group I ribozyme has been recognized as a promising strategy to repair mRNAs of disease-related genes bearing undesirable mutations [48–50]. Therefore, we examined the *trans*-splicing reaction catalyzed by a kUrd in triblock copolymers (**3-K-2**′**:2-K-4**′**:4-K-3**′) including the closed trimer.

**Figure 7.** *Trans*-splicing reaction by the ribozyme unit 2 formed in the closed kUrd heterotrimer. (**A**) Parent *cis*-splicing reaction of the bimolecular group I ribozyme. 5′ and 3′ exons were both provided by the unimolecular ∆P5 ribozyme RNA, the activity of which is repressed in the absence of the P5abc RNA. Splicing reaction produced Spinach RNA as the ligated exons. Spinach RNA captured DFHBI and induced its fluorescence. (**B**) Closed kUrd trimers and their partial dimers used for analysis of the *trans*-splicing reaction. (**C**) Time-dependent increases in fluorescence of Spinach RNA–DFHBI complex. Splicing reactions were carried out at 37 °C.

We installed 5′ and 3′ exon sequences and their recognition elements into the α- and β-chain RNAs of **2-K-4**′ kUrd, respectively (Figure S10C and Table S2). Exon sequences were appended to the ribozyme module consisting of the wild-type ∆P5 ribozyme module activated by the P5abc module provided by **3-K-2**′ kUrd through the **2:2**′ interface (Figure 7B and Figure S10C). We used a shortened derivative of the Spinach RNA aptamer as a ligated exon sequence dissected in precursor form (Figure 7A) [51]. Spinach RNA is an RNA receptor for DFHBI and its fluorescence is induced by restricting the molecular motion of DFHBI in the complex form [52]. The progression of the *trans*-splicing reaction can be monitored in the presence of DFHBI [51,53]. To evaluate the effects of kUrd oligomer formation on the *trans*-splicing reaction, we prepared a set of kUrd variants, some of which possessed 5′ and 3′ exon sequences (Figure 7B and Figure S10, see also Table S2). A trio of kUrds (**3-K-2**′**:2-K-4**′**:4-K-3**′) to form a triblock oligomer (**trimer-1**) exhibited an increase in emission (green plot in Figure 7C). We also examined two sets of triblock oligomers (**trimer-2** and **trimer-3**) in which the 5′ exon and 3′ exon were installed in incorrect RNA chains and the two exons were unable to meet in one ∆P5 ribozyme unit. The two sets of kUrds showed no increase in fluorescence from Spinach RNA complexed with DFHBI (purple plot and blue plot in Figure 7C). These data indicated that the correct assignments of 5′ and 3′ exons in six RNA chains are crucial in the oligomeric state of kUrds.

To evaluate the effects of triblock oligomerization (**3-K-2**′**:2-K-4**′**:4-K-3**′) in the splicing proficient trimer (**trimer-1**), we assayed a solution containing **2-K-4**′ and **3-K-2**′, which form **dimer-1**, in which the intron form of the ribozyme unit in **2-K-4**′ was formed correctly and **2-K-4**′ was correctly activated by the P5abc module in **3-K-2**′ (Figure 7B). The difference in activity between the presence and absence of the third kUrd unit (**4-K-3**′) mainly reflected the oligomerization ability. The solution of **dimer-1** exhibited emission (red plot in Figure 7C), which was, however, about two thirds that of the solution containing three kUrds. This observation suggested that **4-K-3**′ improved the activity of the splicing ribozyme unit through the formation of oligomeric states. To examine the contribution of the P5abc module provided by **3-K-2**′, we depleted the **3-K-2**′ unit from the three kUrd components. The sample solution containing **2-K-4**′ and **4-K-3**′ to form **dimer-2** showed no increase in emission (orange plot in Figure 7C), indicating the importance of activation by the P5abc module for full splicing ability of the ∆P5 ribozyme module possessing 5′ and 3′ exons.

#### **4. Discussion**

In this study, we performed molecular modeling of kUrd to design a closed trimer, the formation of which was confirmed by EMSA and AFM analyses. On the other hand, significant formation of higher oligomers indicated that the current structural design was insufficient partly due to the 3D structure of the full-length *Tetrahymena* ribozyme used for the molecular design in this study. In molecular modeling of kUrds (a dimeric form of an engineered *Tetrahymena* ribozyme connected by the kink-turn motif), we used a model 3D structure of the full-length ribozyme and the X-ray crystallographic structure of a shortened form of the ribozyme lacking some peripheral elements [29]. There may have been some deviations from the full-length *Tetrahymena* ribozyme 3D structure, which has yet to be solved by X-ray crystallography. Improvement of 3D structural design, however, may be possible because a new 3D model structure of the full-length ribozyme was constructed recently by combining cryo-EM data, chemical probing data and molecular modeling [54]. Although the resolution of this new model is modest (6.8 Å), it will be useful to refine the design of ribozyme-based nanostructures.

The formation of the closed dimer is also an issue to be resolved for precise design of kUrd-based RNA nanostructures. Formation of the closed kUrd dimer, which was unexpected from molecular modeling, suggested the structural flexibility of kUrd RNAs and entropic effects in RNA assembly. The flexible nature of the kink-turn RNA motifs has been analyzed and reported previously [55–57]. Therefore, we hypothesized that the formation of the closed dimer may also be supported by the structural flexibility of the KT-15 kink-turn motif. Distinct from our previous Urd with an extended linker, each kUrd can accept L7Ae protein at its linker element bearing the KT-15 kink-turn motif. EMSA of the heterotrimer in the presence of L7Ae suggested the formation of KT-15–L7Ae complex at each corner of the triangle-shaped closed kUrd trimer. We also performed AFM analysis of the heterotrimer in the presence of L7Ae but no significant differences were observed between images of kUrds with and without L7Ae. This was conceivable because L7Ae is much smaller than kUrd RNA. We also found that L7Ae protein strongly inhibited the catalytic ability of kUrds. Collaboration between L7Ae protein and the *Tetrahymena* ribozyme unit in catalysis is still an open issue.

In this study, we mainly analyzed the closed trimer and closed dimer of kUrds. The closed trimer contained six ribozyme units and three kink-turn motifs that interacted with L7Ae proteins. In reactions promoted by the ribozyme unit in kUrd oligomers, the *trans*-splicing reaction proceeded more efficiently in the presence of three kUrds forming a heterotrimer than in the presence of two kUrds providing the core ∆P5 ribozyme and P5abc activator. Although the extent of improvement was moderate, this was a promising result as an initial step toward RNA processing in vivo controlled by ribozyme-based nanostructures. The kUrd trimer containing six group I ribozyme units consists of nearly 2400 nucleotides, which is comparable to bacterial 16S and 23S ribosomal RNAs. The size of the triangle shape (with an edge of approximately 34 nm) was also comparable to the bacterial ribosome (approximately 20 nm in diameter). Although the structure and function of kUrd-based nanostructures are still primitive, this study represents an initial step toward artificial construction of complex RNA nanostructures capable of playing crucial roles in naturally occurring biological systems. *Trans*-splicing ribozymes have been investigated as a promising tool to repair mutated mRNAs causing genetic disorders [49,50]. Ribozyme dimers employed in this study may be used to repair two distinct mutated mRNAs simultaneously and cooperatively. Two ribozyme units in a dimer can also target the same mRNA sequence, on which the two *trans*-splicing ribozyme units perform distinct splicing reactions to yield different products. For example, one ribozyme unit could repair the mutation of the target mRNA whereas the other ribozyme unit attaches a fluorescent RNA aptamer to the target mRNA for its fluorescent imaging. Fabrication of closed trimers using ribozyme dimers and the KT-15 kink-turn would enable us to attach various protein components through L7Ae protein that binds to KT-15 [38,39]. Such ribozyme-based RNA–protein nanoparticles can be further decorated with RNA-based sensors and therapeutic RNA modules such as RNA aptamers and siRNAs, with which the next generation of RNA-based nanomedicines may be developed.

**Supplementary Materials:** The following are available online at https://www.mdpi.com/2076-3417/11/6/2583/s1, Figure S1: Sequences and secondary structures of the *Tetrahymena* group I ribozymes and their structural elements used for modular engineering in this study, Figure S2: Scheme of rational redesign of unimolecular *Tetrahymena* ribozymes to generate kUrd **1-K-1**′, Figure S3: Kink-turn unit ribozyme dimers (kUrds) employed in this study, Figure S4: Formation of kUrds through the assembly of their α-chain and β-chain RNAs, Figure S5: Oligomerization of **1-K-1**′ and its mutants, Figure S6: EMSA of triblock copolymers formed in the presence of 20 mM Mg<sup>2+</sup>, Figure S7: AFM images of the closed heterotrimer and heterodimer and their cross sections, Figure S8: AFM images of the closed heterodimer and its mutant monomers, Figure S9: Complex formation of L7Ae proteins with KT-15 kink-turn motifs in a closed kUrd heterotrimer, Figure S10: Closed kUrd trimers and their partial dimers used for analysis of the *trans*-splicing reaction.

**Author Contributions:** Conceptualization, Y.I.; methodology, Y.I., H.S. (Hirohide Saito), H.S. (Hiroshi Sugiyama), M.E. and S.M.; investigation, J.A., K.H. and Y.F.; writing—original draft preparation, Y.I. and T.Y.; writing—review and editing, H.S. (Hirohide Saito), H.S. (Hiroshi Sugiyama), M.E. and S.M.; supervision, Y.I., H.S. (Hirohide Saito), H.S. (Hiroshi Sugiyama) and M.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by University of Toyama Discretionary Funds of the President "Toyama RNA Collaborative Research" (to Y.I. and S.M.).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article*

## **Chloro-Substituted Naphthyridine Derivative and Its Conjugate with Thiazole Orange for Highly Selective Fluorescence Sensing of an Orphan Cytosine in the AP Site-Containing Duplexes**

#### **Chun-xia Wang** † **, Yusuke Sato, Takashi Sugimoto, Norio Teramae and Seiichi Nishizawa \***

Department of Chemistry, Graduate School of Science, Tohoku University, Sendai 980-8578, Japan; cxwang@iccas.ac.cn (C.-x.W.); yusuke.sato.a7@tohoku.ac.jp (Y.S.); sgmt6600@gmail.com (T.S.); norio.teramae.d3@tohoku.ac.jp (N.T.)

**\*** Correspondence: seiichi.nishizawa.c8@tohoku.ac.jp

† College of New Energy and Materials, China University of Petroleum Beijing, Beijing 102249, China.

Received: 21 May 2020; Accepted: 13 June 2020; Published: 16 June 2020

**Abstract:** Fluorescent probes with the binding selectivity to specific structures in DNAs or RNAs have gained much attention as useful tools for the study of nucleic acid functions. Here, chloro-substituted 2-amino-5,7-dimethyl-1,8-naphthyridine (ClNaph) was developed as a strong and highly selective binder for target orphan cytosine opposite an abasic (AP) site in the DNA duplexes. ClNaph was then conjugated with thiazole orange (TO) via an alkyl spacer (ClNaph–TO) to design a light-up probe for the detection of cytosine-related mutations in target DNA. In addition, we found the useful binding and fluorescence signaling of the ClNaph–TO conjugate to target C in AP site-containing DNA/RNA hybrid duplexes with a view toward sequence analysis of microRNAs.

**Keywords:** fluorescent probe; conjugate; abasic site; DNA; microRNA

#### **1. Introduction**

Much attention has been paid to the design of fluorescent probes capable of selective binding to specific structures in DNAs and RNAs [1,2]. This class of fluorescent probes has been designed to bind abasic (apyrimidinic or apurinic; AP) sites [3–8], bulges [9,10], mismatched sites [11,12] and overhanging structures [13–15], by which the biological functions of these noncanonical base-pairs have been examined. Moreover, these fluorescent probes have great potential to be applied for the analysis of target DNAs and RNAs based on their binding-induced fluorescence signaling. These probes also serve as the affinity-labeling agents in the label-free assays for detecting various analytes [16–18].

We developed AP site-binding ligands (APLs) that are conjugated with thiazole orange (TO) for the fluorescence sensing of orphan nucleobases in DNA/DNA duplexes (Figure 1A) [19,20]. APLs can form pseudo-base pairs with target orphan nucleobases, which allows selective binding to the target nucleobase opposite the AP site [3–7]. On the other hand, the TO unit, connected with the APL unit through an appropriate linker, can function as a fluorescent intercalator [21,22]. The TO unit alone shows negligible fluorescence, but its fluorescence is greatly enhanced upon intercalation into the base pairs near the AP site. APL–TO conjugates thus enable the light-up sensing of target orphan nucleobases in AP site-containing DNA (AP–DNA) duplexes. Such light-up probes are useful for a more sensitive analysis compared to fluorescence quenching probes. We succeeded in the detection of single-base mutations in target DNA sequences based on the binding and light-up functions of APL–TO conjugates in combination with an AP–DNA probe (cf. Figure 1A). In addition, these conjugates were applicable to the sequence-selective analysis of microRNAs, a class of small non-coding RNAs associated with various diseases and cancers [23], by the selective binding of conjugates to target orphan nucleobases in the AP site-containing DNA/RNA (AP–DNA/RNA) hybrids. The properties of nucleobase-selectivity and fluorescence emission wavelength are tunable by adopting suitable APLs and cyanine dyes, respectively, which enables the development of multiplex analysis of target sequences by the simultaneous use of these conjugates in a single solution [19]. In this context, 2-amino-5,6,7-trimethyl-1,8-naphthyridine (ATMND, Figure 1B) was previously used as an APL unit in the conjugate with the TO unit for the light-up sensing of target orphan C in the AP–DNA and AP–DNA/RNA hybrid duplexes [19]. The binding selectivity of ATMND toward target C can be rationalized by the complementary base pairing of the N1-protonated form of ATMND with target C through hydrogen bonding (Figure 1B) [3]. However, ATMND inherently has only moderate selectivity for C over T because of the possible protonation at N8 that allows the recognition of T (Figure 1B). This should limit the application of the conjugate for the sequence analysis of target DNAs and microRNAs, such as the discrimination of C against T or U.

**Figure 1.** (**A**) Schematic illustration of the analysis of target DNA or microRNA sequences based on the binding of abasic (AP) site-binding ligand–thiazole orange conjugates (APL–TO) in combination with an AP site-containing probe for hybridization; (**B**) Chemical structures of chloro-substituted 2-amino-5,7-dimethyl-1,8-naphthyridine (ClNaph) and its related derivatives. The proposed binding of these ligands with cytosine and thymine (two possible patterns) is also shown.

This work describes the enhanced binding selectivity of a 2-amino-1,8-naphthyridine derivative for orphan C by the incorporation of a chloro group. We previously found that the introduction of the electron-withdrawing trifluoromethyl group into 2-amino-7-methyl-1,8-naphthyridine (AMND) led to remarkably enhanced C/T selectivity due to more favorable protonation at N1 compared to N8 [24]. However, the obtained CF3-AMND (Figure 1B) showed much weaker affinity compared to ATMND. Herein, 2-amino-5,7-dimethyl-1,8-naphthyridine (ADMND), an AMND derivative with an additional methyl group, was explored as the scaffold for the introduction of an electron-withdrawing group, considering that ADMND can bind more strongly to target C than AMND [3]. The resulting compound carrying the chloro group at position 6 in the naphthyridine ring (ClNaph) was found to exhibit strong binding affinity as well as high selectivity for target C. The use of ClNaph as an APL unit in the conjugate with TO was shown to be useful for the design of a light-up probe toward target C over other nucleobases in the AP–DNA duplexes and AP–DNA/RNA hybrids, with a view toward the detection of C-related mutations in DNA and microRNA sequences.

#### **2. Materials and Methods**

#### *2.1. Materials*

All of the DNAs and RNAs were purchased from Nihon Gene Research Laboratories, Inc. (Sendai, Japan) and Sigma-Genosys (Hokkaido, Japan), respectively. The other reagents were purchased from standard suppliers and used without further purification. The concentrations of DNAs and RNAs were determined from the molar extinction coefficient at 260 nm (ε<sub>260</sub>) according to the literature [25]. Water was deionized (≥18.0 MΩ cm specific resistance) by an Elix 5 UV water purification system and a Milli-Q synthesis A10 system (Millipore Corp., Bedford, MA, USA). <sup>1</sup>H NMR spectra were measured with a JEOL ECA-600 spectrometer at 500 MHz. High-resolution ESI-MS spectra were measured with a Bruker APEX III mass spectrometer.

Unless otherwise mentioned, all measurements were performed in 10-mM sodium cacodylate buffer solutions (pH 7.0) containing 100-mM NaCl, 1.0-mM EDTA and ethanol (<2%). Before the measurements, target duplex-containing samples were annealed as follows: heated at 75 °C for 10 min and gradually cooled to 5 °C (3 °C/min), after which the solution temperature was raised again to 20 °C. The probe was then added to the samples.

#### *2.2. Probe Synthesis*

ClNaph: 2,6-diaminopyridine (3.17 g, 29.1 mmol) in phosphoric acid (50 mL) was added to 3-chloropentane-2,4-dione (3.99 g, 29.6 mmol) and the reaction mixture was stirred at 90 °C for 24 h. After neutralization with NaOH, the mixture was filtered. The brown residue was extracted with 300 mL of CHCl<sub>3</sub> three times and the organic phase was dried over Na<sub>2</sub>SO<sub>4</sub>. The solvent was evaporated *in vacuo*, affording ClNaph as a dark brown solid (3.57 g, 17.2 mmol, 59.2%). <sup>1</sup>H-NMR (CDCl<sub>3</sub>): 8.03 (d, 1H, *J* = 8.4 Hz), 6.76 (d, 1H, *J* = 9.2 Hz), 2.75 (s, 3H), 2.64 (s, 3H). High resolution ESI-MS calcd ([M + H]<sup>+</sup>) 207.0563; found, 207.0569.
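The reported yield can be cross-checked from the masses given above. The following is a minimal sketch, not part of the original procedure: it assumes that 2,6-diaminopyridine (C5H7N3) is the limiting reagent, that ClNaph has the molecular formula C10H10ClN3 (inferred here from its name), and it uses rounded standard atomic weights. The helper names `molar_mass` and `percent_yield` are hypothetical.

```python
# Rounded standard atomic weights in g/mol (assumption: adequate for a yield check).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "Cl": 35.45}

def molar_mass(formula):
    """Molar mass (g/mol) of a compound given as {element: count}."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())

def percent_yield(product_g, product_formula, limiting_g, limiting_formula):
    """Percent yield for a 1:1 stoichiometry between limiting reagent and product."""
    mol_product = product_g / molar_mass(product_formula)
    mol_limiting = limiting_g / molar_mass(limiting_formula)
    return 100.0 * mol_product / mol_limiting

DIAMINOPYRIDINE = {"C": 5, "H": 7, "N": 3}     # 2,6-diaminopyridine
CLNAPH = {"C": 10, "H": 10, "Cl": 1, "N": 3}   # assumed formula of ClNaph

mmol_product = 1000.0 * 3.57 / molar_mass(CLNAPH)
yield_pct = percent_yield(3.57, CLNAPH, 3.17, DIAMINOPYRIDINE)
print(f"{mmol_product:.1f} mmol, {yield_pct:.1f} % yield")
```

Under these assumptions the calculation gives roughly 17.2 mmol and 59.2%, in line with the values stated in the procedure.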

ClNaph–TO and ClNaph–TO2 conjugates: The conjugates were synthesized according to our previous report [19]. Briefly, an aminoethyl group was incorporated into ClNaph and the resulting derivative was coupled with the carboxylate-terminated decanyl (C10) spacer-containing TO derivative [26,27]. ClNaph–TO conjugate, <sup>1</sup>H-NMR (DMSO-*d*<sub>6</sub>) 8.80 (d, 1H, *J* = 8.8 Hz), 8.60 (d, 1H, *J* = 7.2 Hz), 8.13 (d, 1H, *J* = 8.8 Hz), 8.05 (d, 1H, *J* = 7.6 Hz), 8.00 (t, 1H, *J* = 7.6 Hz), 7.80 (d, 1H, *J* = 8.4 Hz), 7.76 (t, 1H, *J* = 8.0 Hz), 7.63 (t, 1H, *J* = 8.4 Hz), 7.41 (t, 1H, *J* = 8.0 Hz), 7.36 (d, 1H, *J* = 7.6 Hz), 6.93 (s, 1H), 6.78 (d, 1H, *J* = 9.2 Hz), 4.58 (t, 2H, *J* = 7.2 Hz), 3.44–3.47 (q, 4H), 2.58 (s, 3H), 2.54 (s, 3H), 2.05 (t, 2H, *J* = 7.2 Hz), 1.84 (t, 2H), 1.42 (t, 2H), 1.34–1.16 (q, 14H). High resolution ESI-MS calcd ([M + H]<sup>+</sup>) 707.3299; found, 707.3292. ClNaph–TO2 conjugate, <sup>1</sup>H-NMR (DMSO-*d*<sub>6</sub>) 8.72 (d, 1H, *J* = 8.8 Hz), 8.61 (d, 1H, *J* = 6.8 Hz), 8.07–8.01 (m, 4H), 7.78–7.74 (m, 2H), 7.62–7.58 (t, 1H, *J* = 7.6 Hz), 7.43–7.37 (m, 2H), 6.92 (s, 1H), 6.77 (d, 2H, *J* = 4.0 Hz), 4.61 (t, 2H), 4.17 (s, 3H), 3.50–3.42 (m, 4H), 2.58 (s, 3H), 2.55 (s, 3H), 2.00 (t, 2H, *J* = 7.2 Hz), 1.59–1.12 (m, 18H). High resolution ESI-MS calcd ([M + H]<sup>+</sup>) 707.3299; found, 707.3286.

#### *2.3. Fluorescent Measurements*

Fluorescence spectra were measured with a JASCO model FP-6500 spectrofluorophotometer equipped with a thermoelectrically temperature-controlled cell holder (Japan Spectroscopic Co., Ltd., Tokyo, Japan) using a 3 × 3 mm quartz cell.

The dissociation constant (*K*<sub>d</sub>) of the probe was determined at 20 °C by fluorescent titration experiments. The changes in fluorescence intensity were analyzed by nonlinear least-squares regression based on a 1:1 binding isotherm [19]. Errors in the *K*<sub>d</sub> values are the standard deviations obtained from three independent experiments (*N* = 3).
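The fitting step can be illustrated with a short sketch. The exact regression equation is given in [19] and is not reproduced here, so the code below assumes the standard quadratic solution of a 1:1 equilibrium at fixed probe concentration, with the signal *F*/*F*<sub>0</sub> changing linearly with the bound fraction. The titration points are synthetic (generated from an assumed *K*<sub>d</sub> of 70 nM), not measured data, and the function names are hypothetical.

```python
import math

def complex_conc(p0, d0, kd):
    """[PD] for P + D <-> PD, from the quadratic mass-balance solution."""
    b = p0 + d0 + kd
    return (b - math.sqrt(b * b - 4.0 * p0 * d0)) / 2.0

def sse_for_kd(kd, p0, d_totals, ratios):
    """For a trial Kd, solve the amplitude by linear least squares; return (SSE, amplitude)."""
    theta = [complex_conc(p0, d, kd) / p0 for d in d_totals]   # bound fraction of probe
    dy = [r - 1.0 for r in ratios]                             # F/F0 - 1
    amp = sum(t * y for t, y in zip(theta, dy)) / sum(t * t for t in theta)
    sse = sum((y - amp * t) ** 2 for t, y in zip(theta, dy))
    return sse, amp

def fit_kd(p0, d_totals, ratios, lo=0.1, hi=1e5, n=120):
    """Coarse log-grid scan followed by golden-section refinement on log10(Kd)."""
    f = lambda x: sse_for_kd(10.0 ** x, p0, d_totals, ratios)[0]
    step = (math.log10(hi) - math.log10(lo)) / n
    grid = [math.log10(lo) + i * step for i in range(n + 1)]
    i_best = min(range(n + 1), key=lambda i: f(grid[i]))
    a, b = grid[max(i_best - 1, 0)], grid[min(i_best + 1, n)]
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    kd = 10.0 ** ((a + b) / 2.0)
    return kd, sse_for_kd(kd, p0, d_totals, ratios)[1]

# Synthetic quenching titration: 500 nM probe, assumed Kd = 70 nM, 80% quenching when bound.
P0, TRUE_KD, TRUE_AMP = 500.0, 70.0, -0.8
dna_nM = [0, 50, 100, 200, 300, 500, 750, 1000, 1500, 2500]
f_ratio = [1.0 + TRUE_AMP * complex_conc(P0, d, TRUE_KD) / P0 for d in dna_nM]

kd_fit, amp_fit = fit_kd(P0, dna_nM, f_ratio)
print(f"Kd = {kd_fit:.1f} nM, amplitude = {amp_fit:.2f}")
```

Separating the linear amplitude from the nonlinear *K*<sub>d</sub> keeps the search one-dimensional, which is why no external optimizer is needed in this sketch. With real data, the fit would be repeated for each of the three independent titrations to obtain the standard deviation.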

#### **3. Results and Discussion**

First, we examined the binding ability of ClNaph (500 nM) to target orphan nucleobases in model 21-meric AP–DNA duplexes (500 nM; 5′-d(GCA GCT CCC A**X**A GTC TCC TCG)-3′/3′-d(CGT CGA GGG T**N**T CAG AGG AGC)-5′, **X** = AP site (Spacer C3, a propyl residue), **N** = target nucleobase; G, C, A or T) by fluorescence measurements. As shown in Figure 2, ClNaph showed emission with a maximum at 393 nm in the absence of DNAs. The addition of AP–DNA duplexes caused a decrease in the fluorescence intensity of ClNaph. The largest fluorescence quenching response was observed for target C, which indicates that ClNaph binds preferentially to target C compared to target G, A and T. The binding affinity of ClNaph was assessed by fluorescence titration experiments (Inset of Figure 2). ClNaph showed the quenching response for target DNA duplexes in a concentration-dependent manner and the resulting titration curves were analyzed by a 1:1 binding model for the determination of the dissociation constants (*K*<sub>d</sub>). The *K*<sub>d</sub> value for C was obtained as 70 ± 5.3 nM. Significantly, this affinity is much stronger than that of the parent ADMND and the previously developed CF3-AMND and is almost comparable to that of ATMND (Table 1). In addition, ClNaph showed useful binding selectivity toward target C: the *K*<sub>d</sub> value for target C was roughly 30-fold smaller than that for T and at least 70-fold smaller than those for G and A (*K*<sub>d</sub>/nM: T, 2020 ± 240; G and A, > 5000). The observed selectivity for C over T far exceeds that of ADMND and ATMND [3], although it is somewhat lower than that of CF3-AMND [24]. We reason that this is due to the more favorable protonation of ClNaph at the N1 position for the complementary base-pairing with orphan C compared to the N8 protonation (Figure 1B), where the electron-withdrawing effect of the chloro group would be essential, as was observed for CF3-AMND [24]. We note that the type of the AP site affects the binding of ClNaph. The use of Spacer C3 (propyl residue) allowed stronger binding to target C compared to the tetrahydrofuranyl residue (dSpacer), presumably due to less steric hindrance (Figure S3). The observed strong and highly selective binding properties for target C should render ClNaph a useful APL unit in the conjugate with the TO unit.
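To put the C/T selectivity on an energy scale, the ratio of dissociation constants can be converted to a difference in binding free energy using the standard relation ΔΔ*G* = *RT* ln(*K*<sub>d,T</sub>/*K*<sub>d,C</sub>). This conversion is not made in the original text; the sketch below is only an illustration using the *K*<sub>d</sub> values quoted above (70 nM for C, 2020 nM for T) at the measurement temperature of 20 °C, and the function name `selectivity` is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1

def selectivity(kd_target, kd_competitor, temp_c=20.0):
    """Return (Kd ratio, ddG in kJ/mol) favoring the target over the competitor."""
    ratio = kd_competitor / kd_target
    ddg_kj = R * (temp_c + 273.15) * math.log(ratio) / 1000.0
    return ratio, ddg_kj

# Kd values (nM) quoted in the text for ClNaph binding to orphan C and T.
ratio_ct, ddg_ct = selectivity(70.0, 2020.0)
print(f"C/T selectivity: {ratio_ct:.0f}-fold, ddG = {ddg_ct:.1f} kJ/mol")
```

With these inputs the ratio is about 29-fold, corresponding to roughly 8 kJ/mol. Because only a lower bound (> 5000 nM) is available for G and A, the same function would yield only lower limits for those nucleobases.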

**Figure 2.** Fluorescence response of ClNaph (500 nM) to target AP–DNA duplexes (500 nM). Inset: Titration curves for the binding of ClNaph (500 nM) to target nucleobases. Measurements were done in solutions buffered to pH 7.0 (10-mM sodium cacodylate) containing 100-mM NaCl and 1.0-mM EDTA. *F* and *F*<sub>0</sub> denote the fluorescence intensities of ClNaph in the presence and absence of DNA duplexes. Excitation, 350 nm. Analysis, 393 nm. Temperature, 20 °C.

**Table 1.** Dissociation constants (*K*<sub>d</sub>) of ClNaph and its related derivatives for target C in 21-meric AP–DNA duplexes <sup>a</sup>.

<sup>a</sup> *K*<sub>d</sub> values measured in solutions buffered to pH 7.0 (10-mM sodium cacodylate) containing 100-mM NaCl and 1.0-mM EDTA. Temperature, 20 °C. <sup>b</sup> Values taken from [3]. <sup>c</sup> Value taken from [24].

ClNaph was coupled with the quinoline ring of the TO unit through a long alkyl (C10) linker according to our previous report [19], which afforded the ClNaph–TO conjugate (Figure 3A). We measured the fluorescence response of the TO unit of the conjugate (500 nM) for the same AP–DNA duplexes (500 nM) as those used for the examination of ClNaph (cf. Figure 2). As shown in Figure 3A, the TO unit shows negligible emission in the absence of DNAs due to the free rotation of the benzothiazole and quinoline rings [21]. In contrast, we observed a remarkable light-up response of the TO unit for the target C-containing AP–DNA duplex, in which the fluorescence intensity at 526 nm increased by 207-fold. This can be explained by the restriction of the rotation of the TO unit by intercalation into the duplex region (cf. Figure 1A) [19]. In addition, the fluorescence titration experiments revealed that the *K*<sub>d</sub> value of the ClNaph–TO conjugate for target C reached 6.6 ± 0.3 nM (Figure S4). This affinity is one order of magnitude stronger than that of ClNaph itself (cf. Table 1). Apparently, the conjugation with the TO unit led to the enhanced binding affinity for target C in the AP–DNA duplexes. It should be noted that the ClNaph–TO conjugate has high binding selectivity for target C over the other three nucleobases (Figure S4: *K*<sub>d</sub> values for T, G and A > 1500 nM). Considering that TO lacks selectivity for the orphan nucleobases [19], the ClNaph unit is responsible for the observed C-selectivity of the conjugate. This was confirmed by the fluorescence quenching response of the ClNaph unit with high selectivity to target C (Figure S5). These results showed the useful binding and light-up functions of the ClNaph–TO conjugate for the detection of C-related mutations in target DNA sequences. It is also noteworthy that the strong emission of the ClNaph–TO conjugate for target C can be seen even with the naked eye under UV light irradiation (Figure 3B). The fluorescence response for target C is clearly distinguishable from those for other target nucleobases, which facilitates a simple and rapid analysis of target DNA sequences.

**Figure 3.** (**A**) Fluorescence response of ClNaph–TO conjugate (500 nM) to target AP–DNA duplexes (500 nM). Excitation, 504 nm. Temperature, 20 °C. (**B**) Images of fluorescence emission of ClNaph–TO conjugate (500 nM) in the absence and presence of target AP–DNA duplexes (500 nM) under excitation light with the wavelength of 365 nm, obtained by a digital camera at room temperature. Other solution conditions are the same as those given in Figure 2.

We found that the heterocycle of the TO unit to which the ClNaph unit is connected, either the benzothiazole or the quinoline ring, affected the binding and fluorescence signaling abilities of the conjugate toward AP–DNA duplexes. When the C10-linker was appended to the benzothiazole ring, the resulting conjugate (ClNaph–TO2) showed a selective light-up response for target C in the AP–DNA duplexes (Figure S6); however, the degree of the response was smaller compared to ClNaph–TO (cf. Figure 3A). This is attributable to the reduced binding affinity of the ClNaph–TO2 conjugate to target C (*K*<sub>d</sub> = 71 nM). This indicates that the connection of the ClNaph unit to the benzothiazole ring hinders the effective intercalation of the TO unit into DNA duplexes, as reported in the literature [27]. In addition, we observed only moderate selectivity of the ClNaph–TO2 conjugate for target C over the other three nucleobases in AP–DNA duplexes (Figure S6), although the reason for this is not yet clear. Hence, the connection of the ClNaph unit to the quinoline ring of the TO unit is effective for the strong binding and large light-up response of the conjugate for target C in the AP–DNA duplexes.

As described above, the ClNaph–TO conjugate can serve as a useful light-up probe for target C in the AP–DNA duplexes. Meanwhile, we noticed weak binding of the conjugate to AP–RNA duplexes (Figure 4A and Figure S7). The *K*<sub>d</sub> value was estimated as > 1300 nM for target C in the AP–RNA duplex (5′-r(GCA GCU CCC A**X**A GUC UCC UCG)-3′/3′-r(CGU CGA GGG U**C**U CAG AGG AGC)-5′, **X** = AP site (Spacer C3), **C** = target nucleobase), which was more than two orders of magnitude larger than that for the AP–DNA duplex (cf. Figure S4). We consider that this arises from the preferential binding of the ClNaph–TO conjugate to B-form AP–DNA duplexes relative to A-form AP–RNA duplexes [6]. However, when targeting C in single-stranded RNA, the use of an AP–DNA probe is highly effective. The resulting duplex is the hybrid between RNA and AP–DNA (5′-d(GCA GCT CCC A**X**A GTC TCC TCG)-3′/3′-r(CGU CGA GGG U**C**U CAG AGG AGC)-5′, **X** = AP site (Spacer C3), **C** = target nucleobase), which can adopt an A-form/B-form intermediate structure [28]; it was examined under the same measurement conditions as those used for the AP–RNA duplexes. As shown in Figure 4B, we observed a significant light-up response of the conjugate for the AP–DNA/RNA hybrid, in which the response was 22-fold larger compared to that for the AP–RNA duplex. The *K*<sub>d</sub> for target C in the hybrid was obtained as 22 nM (Figure S7), an affinity far superior to that for AP–RNA duplexes. Importantly, the ClNaph–TO conjugate retains its selectivity for target C over U opposite the AP site in the DNA/RNA hybrid, as was observed in AP–DNA duplexes (cf. Figure 3B). These results suggest the potential use of the ClNaph–TO conjugate for the detection of microRNAs in combination with an AP–DNA hybridization probe (cf. Figure 1A).
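The practical consequence of these *K*<sub>d</sub> differences under the assay conditions (500 nM conjugate, 500 nM duplex) can be estimated with a short calculation. The sketch below assumes ideal 1:1 binding and uses the *K*<sub>d</sub> values quoted in the text; the bound fractions it prints are model estimates, not measured quantities, and for the AP–RNA duplex only an upper limit is obtained because its *K*<sub>d</sub> is reported as a lower bound. The function name `bound_fraction` is hypothetical.

```python
import math

def bound_fraction(probe_nM, duplex_nM, kd_nM):
    """Fraction of probe in the 1:1 complex at equilibrium (quadratic solution)."""
    b = probe_nM + duplex_nM + kd_nM
    complex_nM = (b - math.sqrt(b * b - 4.0 * probe_nM * duplex_nM)) / 2.0
    return complex_nM / probe_nM

# Kd values (nM) for target C taken from the text; the AP-RNA value is a lower bound.
KD_BY_DUPLEX = {"AP-DNA": 6.6, "AP-DNA/RNA hybrid": 22.0, "AP-RNA (Kd > 1300)": 1300.0}

fractions = {name: bound_fraction(500.0, 500.0, kd) for name, kd in KD_BY_DUPLEX.items()}
for name, theta in fractions.items():
    print(f"{name}: {100.0 * theta:.0f}% of conjugate bound")
```

Under these assumptions about 89% of the conjugate is bound to the AP–DNA duplex and about 81% to the hybrid, whereas less than about 23% is bound to the AP–RNA duplex, which is qualitatively consistent with the much weaker light-up response in Figure 4A.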

**Figure 4.** Fluorescence response of ClNaph–TO conjugate (500 nM) for target C or U in AP site-containing duplexes (500 nM): (**A**) AP–RNA duplex and (**B**) AP–DNA/RNA hybrid. Other solution conditions are the same as those given in Figure 2. Excitation, 504 nm. Temperature, 20 °C.

We performed preliminary experiments for the detection of microRNAs based on the binding-induced light-up response of the ClNaph–TO conjugate for AP–DNA/RNA hybrids. A 21-meric AP–DNA probe was designed for the selective detection of the let-7d sequence among the let-7 family [29], as shown in Figure 5A. Hybridization between this probe and let-7d allows the construction of the AP–DNA/RNA hybrid containing an orphan C opposite an AP site. Meanwhile, the hybridization with other let-7 sequences leads to the formation of orphan U-containing hybrids with several mismatched base pairs (Table S1). We observed a significant light-up response of ClNaph–TO for the let-7d-containing hybrid (Figure 5A). This response is much larger than those for other let-7 sequences, which clearly shows that the ClNaph–TO conjugate serves as a selective detection probe for let-7d over other let-7 members. The binding affinity of the conjugate for the let-7d-containing hybrid (*K*<sub>d</sub> = 23 nM) was found to be comparable to that for the model AP–DNA/RNA hybrid (cf. Figure S8). Our assay can be applied to the analysis of various microRNAs by using an AP site-containing DNA probe whose sequence is designed to be complementary to the target sequence. Figure 5B shows the fluorescence response of the conjugate in combination with the 22-meric AP–DNA probe for let-7i. Selective light-up detection of let-7i was achieved due to the selective binding of the conjugate to target C in the AP–DNA/RNA hybrid formed between the AP–DNA probe and let-7i. These results show the applicability of the ClNaph–TO conjugate for the selective detection of target microRNA sequences, based on hybridization to construct the AP–DNA/RNA hybrid and on the binding-induced light-up response of the conjugate for the orphan C in the resulting hybrid.
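The probe design rule described above (a DNA strand complementary to the target RNA, with an AP site placed opposite the C to be detected) can be sketched in a few lines. The let-7 sequences are given in Figure 5 and Table S1 and are not reproduced here, so the example uses the model RNA strand from the AP–DNA/RNA hybrid above, rewritten 5′→3′. The function name `design_ap_probe` and the use of `X` for the Spacer C3 position are illustrative choices, not the authors' software.

```python
# Watson-Crick partners when a DNA probe pairs with an RNA target.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def design_ap_probe(target_rna_5to3, target_index, ap_symbol="X"):
    """Return the AP-DNA probe (5'->3') for an RNA target written 5'->3'.

    target_index is the 0-based position of the cytosine that should be left
    unpaired (orphan) opposite the AP site.
    """
    if target_rna_5to3[target_index] != "C":
        raise ValueError("this sketch expects the orphan nucleobase to be C")
    paired = [RNA_TO_DNA_COMPLEMENT[base] for base in target_rna_5to3]
    paired[target_index] = ap_symbol          # AP site replaces the partner of C
    return "".join(reversed(paired))          # antiparallel strand, read 5'->3'

# Model RNA strand from the text, 3'-r(CGU CGA GGG UCU CAG AGG AGC)-5', rewritten 5'->3'.
MODEL_RNA_5TO3 = "CGAGGAGACUCUGGGAGCUGC"
probe = design_ap_probe(MODEL_RNA_5TO3, 10)
print(probe)  # GCAGCTCCCAXAGTCTCCTCG
```

The output matches the model probe 5′-d(GCA GCT CCC A**X**A GTC TCC TCG)-3′ used in this study. A related target carrying U at the indexed position would be rejected by the check, mirroring the C-over-U discrimination that the assay relies on.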

**Figure 5.** Selective detection of (**A**) let-7d and (**B**) let-7i sequences by ClNaph–TO conjugate in combination with AP–DNA probe (X = AP site (Spacer C3)). Sequences of other let-7 members are shown in Table S1. [ClNaph–TO], [target microRNA], [AP–DNA probe] = 500 nM. Other solution conditions for the fluorescence response of the conjugate are the same as those given in Figure 2. Excitation, 504 nm. Temperature, 20 °C.

#### **4. Conclusions**

In summary, we report that ClNaph served as a strong and highly selective binder for the orphan C opposite an AP site in DNA duplexes. In addition, we demonstrated the usefulness of ClNaph as the APL unit in the conjugate with a TO unit for the design of light-up probes for the detection of C-related mutations in DNA and microRNA sequences. These results could provide the insights needed to design this class of light-up conjugates suitable for the analysis of DNA and microRNA sequences. As shown in our previous works [19,20], the spacer length and structure have a large impact on the binding and fluorescence sensing abilities of APL–TO conjugates. We will examine these factors for further improvement of the APL–TO conjugates in order to develop practical assays.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2076-3417/10/12/4133/s1, Figure S1: <sup>1</sup>H NMR spectra of the conjugates; Figure S2: ESI-MS spectra of the conjugates; Figure S3: Fluorescence response of ClNaph to target dSpacer-containing AP–DNA duplexes. Inset: Titration curves for the binding of ClNaph to target nucleobases; Figure S4: Titration curves for the binding of ClNaph–TO conjugate to target nucleobases; Figure S5: Fluorescence response of ClNaph unit in the conjugate to target AP–DNA duplexes; Figure S6: Chemical structure of ClNaph–TO2 conjugate and its fluorescence response to target AP–DNA duplexes; Figure S7: Titration curves of the binding of ClNaph–TO conjugate to target C in the AP–DNA/RNA hybrid and AP–RNA duplex; Table S1: Sequences of let-7 members used in this study; Figure S8: Titration curves for the binding of ClNaph–TO conjugate to the let-7d/DNA probe hybrid.

**Author Contributions:** C.-x.W. and Y.S. conceived and designed the experiments. C.-x.W. and T.S. performed the experiments. C.-x.W. and Y.S. wrote the study. N.T. and S.N. supervised the project. All authors have read and agreed to the published version of the manuscript.

**Funding:** Y.S., N.T. and S.N. acknowledge the support from Scientific Research (S) (No. 22225003) from Japan Society for the Promotion of Science (JSPS).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Applied Sciences* Editorial Office E-mail: applsci@mdpi.com www.mdpi.com/journal/applsci

www.mdpi.com ISBN 978-3-0365-1513-7