Application of RNA-DNA Duplex Base Triplets to Antisense Drugs

Sixty-four sets of three-dimensional models of RNA-DNA duplex base triplets were constructed from the codons by homology modeling using the software InsightII on an Indigo workstation. These models should be helpful for studying RNA-DNA annealing, the basis of nucleic acid interactions, and for identifying peculiar motifs for the design of antisense oligonucleotides. Our results reveal that, for most binary and ternary complexes, the energies E, Ec, Eb, Et and Enr of DNA/asRNA hybrids are lower than those of RNA/asDNA hybrids, while the energies En and End of DNA/asRNA hybrids are higher than those of RNA/asDNA hybrids; the differences are most evident for E, Ec, Eb and En. The total energy of the GGG/CCC hybrid is the lowest of all the hybrids: the more G/C base pairs, the lower the energy of the triplet hybrid, in the order (G/C)3 < U(G/C)2 < A(G/C)2 < U2(G/C) < AU(G/C) < A2(G/C). U-containing hybrids are stabilized: the more uracils (U), the lower the energy of the triplet hybrid. The energy of GU-containing hybrids is lower than that of CU-containing hybrids for both binary and ternary complexes. G, A and U bases often deviate from the base pair planes and can form hydrogen bonds with neighboring base pairs, which affects the stability of the triplets. Moreover, some peculiar oligodeoxynucleotide sequence motifs that could be derived from the corresponding triplet hybrids fall into two groups, four-member motifs and five-member motifs. The former comprises eight motifs, namely 5'-TCTT-3', 5'-TGCT-3', 5'-CCTT-3', 5'-CCAT-3', 5'-CATC-3', 5'-ATCT-3', 5'-GCTG-3' and 5'-GTCT-3'; the latter consists of four motifs, namely 5'-TGCTG-3', 5'-GTCTT-3', 5'-CCATC-3' and 5'-CATCT-3'. These motifs are positively correlated with antisense activities and play an important role in designing antisense drugs.


INTRODUCTION
The essential steps in rational drug design are the identification of an appropriate target responsible for a certain disease and the development of a drug with a specific affinity to that target. One of the most general approaches to drug targeting is the specific manipulation of gene expression at the DNA or RNA stage of protein synthesis. The antisense principle is based on the specific recognition of certain DNA and RNA regions by an antisense oligonucleotide, which inhibits translation by selective pairing of the "sense" strand with the complementary antisense oligonucleotide strand [1,2]. Several mechanisms are possible. First, the initial step of protein synthesis can be inhibited by triple helix formation and subsequent blockade of transcription [3,4]. Secondly, the antisense oligonucleotide may interfere with the processing proteins that are responsible for converting the primary transcript into the mature mRNA. Formation of a double strand between the antisense oligonucleotide and the mRNA may disable the transport of the mRNA from the nucleus to the cytoplasm. Double strand formation in the cytoplasm results in a blockade of protein synthesis in the ribosomes [2,5]. Antisense oligonucleotide technology has been a very important tool in biological research for studying and controlling gene expression and viral functions [6]. In recent years, substantial data from a wide range of animal models have supported the activity, specificity and therapeutic utility of antisense therapeutics. Significant progress has been reported in understanding the basic mechanisms of action of antisense drugs. Generally, antisense drugs are designed to modulate the transfer of information from the gene to the protein. Binding of oligonucleotides to specific sequences may inhibit the interaction of the RNA or DNA with proteins, other nucleic acids or other factors required for essential steps in the intermediary metabolism of the RNA or its utilization by the cell [7].
Oligonucleotides that bind to a DNA-DNA duplex to form a triplex region have also been studied [8,9].
In this study, we focus on constructing three-dimensional structural models of antisense oligonucleotides in complex with their target sequences and provide some peculiar motifs for antisense oligonucleotide design.

MATERIALS AND METHODS
The three-dimensional structures of the 64 RNA-DNA triplets and three antisense drugs were generated using the Biopolymer module of the commercial software package InsightII 2000 (MSI, St Louis, MI, USA) on a Silicon Graphics Iris Indigo (SGI, Silicon, CA, USA) workstation.
The structures of A-type RNA-DNA hybrid nucleotide pairs in the Biopolymer module were used to construct the 64 triplet models. Each triplet contains two nucleic acid strands (one deoxyribonucleic acid strand and one ribonucleic acid strand). One K+ cation was then added to each phosphate group to keep the whole system electrically neutral. The initial position of each K+ ion was in the plane formed by the O-P-O atoms of the phosphate group, at equal distances from the two oxygen atoms. The energy of these triplets was optimized by molecular dynamics and molecular mechanics using the Discover module with the Amber force field. The derivative was set to 0.1. The models were minimized for 500 steps with the steepest descent method and then for 1000 steps with the conjugate gradient method. Finally, the energy-minimized system was soaked in a sphere of water (radius = 5 Å) containing 312 water molecules, forming the ternary complex of water-K+-nucleotide triplet. The energy of this ternary complex was optimized by 200 steps of steepest descent followed by 1000 steps of conjugate gradient minimization, and the complex was then subjected to simulated annealing by molecular dynamics using the Amber force field. A time step of 1 fs was used during dynamics. The system was heated to 1000 K for 1 ps and then cooled to 300 K for 50 ps. The average conformation of a series of lowest-energy conformations was regarded as the dominant conformation of the complex. Following each dynamics run, the total energy was minimized by molecular mechanics using a steepest descent algorithm and a subsequent conjugate gradient method. Similar approaches have been used to study the energies and structural characteristics of DNA triplexes [3,9]. Two sets of data on the energy parameters and conformations of the 64 triplets under different conditions were acquired to study the energy and structural characteristics of RNA-DNA duplexes.
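The initial counterion placement described above (in the O-P-O plane, equidistant from the two oxygens) can be illustrated with a small geometric sketch. This is not the InsightII procedure itself: the function name `place_counterion` and the `distance` parameter are hypothetical, and the text does not state how far from the phosphate group the ion was placed.

```python
import math

def place_counterion(p, o1, o2, distance):
    """Return a point in the O-P-O plane that is equidistant from o1 and o2.

    p, o1, o2 are (x, y, z) coordinates of the phosphorus and the two oxygens.
    `distance` is a hypothetical parameter: the offset from the O...O midpoint,
    measured on the side pointing away from phosphorus.
    """
    mid = [(a + b) / 2 for a, b in zip(o1, o2)]
    v = [m - c for m, c in zip(mid, p)]       # vector P -> midpoint
    w = [b - a for a, b in zip(o1, o2)]       # vector O1 -> O2
    ww = sum(x * x for x in w)
    if ww == 0:
        raise ValueError("the two oxygen positions coincide")
    vw = sum(x * y for x, y in zip(v, w))
    # Remove the component of v along w: u lies in the O-P-O plane
    # and is perpendicular to the O1-O2 axis.
    u = [x - (vw / ww) * y for x, y in zip(v, w)]
    norm = math.sqrt(sum(x * x for x in u))
    if norm == 0:
        raise ValueError("P, O1 and O2 are collinear; the plane is undefined")
    return [m + distance * x / norm for m, x in zip(mid, u)]
```

Because the offset direction is perpendicular to the O1-O2 axis and starts at the midpoint, the returned point is equidistant from both oxygens even when the two P-O bond lengths differ slightly.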
Similarly, three antisense drugs, ISIS2922 (5'-GCGTTTGCTCTTCTTCTTGCG-3', 21nt, treating CMV-caused retinitis, ISIS Inc.) [10-12], c-myb [5,13-15] (5'-TATGCTGTGCCGGGGTCTTCGGGC-3', 24nt, inhibiting oncogene c-myb) and GEM91 (5'-CTCTCGCACCCATCTCTCTCCTTCT-3', 25nt, treating AIDS, Hybridon Inc.) [16-19], were used to construct three A-type mRNA/asDNA hybrids using the InsightII/Biopolymer modules described above [4]. One K+ cation was then added to each phosphate group to keep the whole system electrically neutral. The energy of these hybrids was optimized by molecular dynamics and molecular mechanics using the Discover module with the Amber force field. First, the terminal nucleotide pairs and heavy atoms were fixed, and the mRNA/asDNA binary complexes were optimized for 200 steps with the steepest descent minimizer and subsequently for 200 steps with the conjugate gradient minimizer. The constraints were then removed and the complexes were minimized for 1000 steps with the conjugate gradient minimizer. Secondly, the energy-minimized system was soaked in a sphere of water (radius 5 Å), forming the ternary complex of water-K+-mRNA/asDNA. The energy of this ternary complex was optimized by 200 steps of steepest descent followed by 2000 steps of conjugate gradient minimization, and the complex was then subjected to simulated annealing by molecular dynamics using the Amber force field. A time step of 1 fs was used for the dynamics integration. The system was heated to 1000 K and held for 1 ps, and then cooled to 300 K and held for 50 ps. The average conformation of a series of lowest-energy conformations was regarded as the dominant conformation of the complex. Following each dynamics run, the total energy was minimized by molecular mechanics using a steepest descent algorithm and a subsequent conjugate gradient method. Similar approaches have been used to study the energies and structural characteristics of DNA triplexes [3,9].
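As a quick check on the scale of the annealing runs, the stated time step and stage durations can be converted into integration step counts. The sketch below only encodes the numbers given in the text; the names `ANNEALING_STAGES`, `md_steps` and `schedule` are illustrative and do not correspond to any Discover command.

```python
TIME_STEP_FS = 1.0  # integration time step from the text, in femtoseconds

# (temperature in K, duration in ps) for each stage of the annealing run
ANNEALING_STAGES = [(1000.0, 1.0), (300.0, 50.0)]

def md_steps(duration_ps, time_step_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000.0 / time_step_fs)

def schedule(stages=ANNEALING_STAGES):
    """Return (temperature, step count) pairs for each annealing stage."""
    return [(temp, md_steps(ps)) for temp, ps in stages]
```

Under these settings the high-temperature stage covers 1000 steps and the 300 K stage covers 50000 steps.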

RESULTS
The energies of the 64 K+-RNA/DNA triplet binary complexes, the water-K+-triplet ternary complexes and the three antisense oligodeoxynucleotide complexes are listed in Tables 1 and 2. The coulomb energy (Ec) contributes most to the total energy (E) of the triplet complex, followed in turn by the non-bond dispersion energy (End), non-bond repulsion energy (Enr), phi energy (Ep), theta energy (Et), non-bond energy (En), bond energy (Eb), hydrogen bond energy (Eh) and out of plane energy (Eo). The energy unit is kcal/mol. Table 3 displays the Watson-Crick hydrogen bond types and lengths of the 64 RNA-DNA duplex base triplets, Table 4 shows the hydrogen bond types and lengths between neighboring base pairs, and Table 5 lists the other hydrogen bond types and lengths within the base pairs of the 64 RNA-DNA duplex base triplets.

Analysis of DNA-RNA duplex base triplets: Table 1 contains two types of RNA/DNA hybrids, namely RNA/asDNA and DNA/asRNA, which differ in nucleotide orientation: hybrids based on 5'-3' RNA were defined as RNA/asDNA hybrids, while those based on 5'-3' DNA were defined as DNA/asRNA hybrids. Each RNA-DNA hybrid may therefore be regarded as an RNA/asDNA or a DNA/asRNA hybrid, depending on the orientation of the RNA or DNA. Our results reveal that, for most binary and ternary complexes, the energies E, Ec, Eb, Et and Enr of DNA/asRNA hybrids are lower than those of RNA/asDNA hybrids, while the energies En and End of DNA/asRNA hybrids are higher than those of RNA/asDNA hybrids; the differences are most evident for E, Ec, Eb and En. For Ep, Eo and Eh, the binary and ternary complexes behave differently: in the ternary complexes these energies are lower for DNA/asRNA hybrids than for RNA/asDNA hybrids, while in the binary complexes the opposite holds. This is because a great many hydrogen bonds and hydrophobic interactions arise when the K+-triplet binary complex is surrounded by water, which lowers the energy of the water-K+-triplet ternary complex and makes the system most stable. Moreover, a paired-samples test shows that after soaking the K+-triplet complexes in explicit water, the values of the energy parameters generally increase, except for the non-bond dispersion energy; the energy variations are all significant, which is supported by Liu's results [3]. The total energy of the GGG/CCC hybrid is the lowest of all the hybrids. The more G/C base pairs, the lower the energy of the triplet hybrid, namely (G/C)3 < U(G/C)2 < A(G/C)2 < U2(G/C) < AU(G/C) < A2(G/C), because guanine and cytosine form three hydrogen bonds, leading to lower energy and a more stable structure. Some key triplets based on RNA are found, namely 5'-AGG-3', 5'-AGC-3', 5'-CGA-3', 5'-CAG-3', 5'-GAC-3' and 5'-GCA-3'.
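The composition ordering stated above can be expressed as a small classifier that maps an RNA triplet to its composition class and its position in the reported energy ranking. This is only an illustrative sketch: the function names are hypothetical, it encodes the qualitative ordering and not the tabulated energies, and triplets containing only A and U are returned as `None` because the ordering in the text does not cover them.

```python
# Composition classes in the order of increasing energy reported in the text.
ENERGY_ORDER = ["(G/C)3", "U(G/C)2", "A(G/C)2", "U2(G/C)", "AU(G/C)", "A2(G/C)"]

def composition_class(triplet):
    """Return the composition class of an RNA triplet, or None if it has no G/C."""
    t = triplet.upper()
    if len(t) != 3 or set(t) - set("ACGU"):
        raise ValueError("expected an RNA triplet over A, C, G, U")
    gc = sum(base in "GC" for base in t)
    a, u = t.count("A"), t.count("U")
    if gc == 3:
        return "(G/C)3"
    if gc == 2:
        return "U(G/C)2" if u == 1 else "A(G/C)2"
    if gc == 1:
        if u == 2:
            return "U2(G/C)"
        if a == 2:
            return "A2(G/C)"
        return "AU(G/C)"
    return None  # only A and U: outside the stated ordering

def energy_rank(triplet):
    """Position in ENERGY_ORDER (0 = lowest energy), or None if unclassified."""
    cls = composition_class(triplet)
    return None if cls is None else ENERGY_ORDER.index(cls)
```

For example, sorting `["AAG", "GGG", "UGC"]` with `key=energy_rank` places GGG first and AAG last, in line with the ordering above.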
Moreover, the energy of the AAA/UUU hybrid is lower than that of the AAA/TTT hybrid. The more uracils (U), the lower the energy of the triplet hybrid, because uracil and adenine form two hydrogen bonds more easily than thymine and adenine do. The reason is that the 5-methyl group of thymine causes steric hindrance during hydrogen bond formation with adenine, whereas uracil has no such group. This stabilizes the U-containing hybrid systems. We also find that the energy of GU-containing hybrids based on RNA (such as 5'-UAG-3', 5'-GAU-3', 5'-AGU-3', 5'-UGA-3', 5'-AUG-3' and 5'-GUA-3') is lower than that of CU-containing hybrids for both binary and ternary complexes. Some useful triplet hybrids are also revealed, such as 5'-GGU-3', 5'-UGG-3', UGU, 5'-CCU-3', 5'-UCC-3', CUC, 5'-CGU-3', 5'-UGC-3', 5'-CUG-3', 5'-GUC-3', 5'-GCU-3', 5'-UCG-3', etc. These findings are consistent with the results of a cluster analysis in which the 64 RNA-DNA hybrid triplets are divided into four groups according to their overall energies at a rescaled distance of about 2 [20]. The most stable group is composed of triplets that contain only C and G; the group with 2 G/C pairs in each triplet is the next most stable; and the group with 1 G/C pair in each triplet is less stable than the two G/C-rich groups, but more stable than the most unstable group (containing only A and U, except UUU).
Note: E (total energy), Eb (bond energy), Et (theta energy), Ep (phi energy), Eo (out of plane energy), Eh (hydrogen bond energy), En (non-bond energy), Enr (non-bond repulsion energy), End (non-bond dispersion energy), Ec (coulomb energy). The energy unit is kcal/mol.
We therefore propose the hypothesis that an RNA/asDNA hybrid can form stably when G/C base pairs occur consecutively more than three times, which is consistent with Zewert's result [15], where a seven-member G/C sequence motif exists in an antisense oligodeoxynucleotide. A similar finding occurs in triplex DNA technology, where the third-strand oligodeoxyribonucleic acid acts as an antisense drug [4,9] and the antisense chain mainly recognizes G/C-rich segments, such as SP1 binding sites and promoters. However, it differs from Matveeva's results, in which GGGG, 5'-ACTG-3', 5'-CCGG-3', 5'-TAA-3' and AAA based on DNA are negatively correlated with antisense activities. Moreover, according to Table 1, some corresponding deoxynucleotide triplets could be derived from the useful RNA-based triplet hybrids mentioned above, which may play an important role in inhibiting the mRNA transcript, namely 3'-TCC… These motifs are partly consistent with Matveeva's result [21] and are supported by Yamaguchi's result [18].
Here, two triplets, AUG and GUA, differ from the results based on total energy. However, hydrogen bond energy contributes only weakly to the total energy, whereas coulomb energy dominates the total energy of the triplet complexes through the redistribution of charges.
The loss of hydrogen bonds is often caused by base distortion, in which a base deviates from the base pair plane. U is the base that can lead to severe base distortion, especially when it appears with A/U, as in UUC, UUU, UGA, GUA, AUU, GUA, AUA, AUC and UUA. G-rich triplets are also inclined to deviate from the base pair plane. Besides causing the loss of hydrogen bonds, distortion of a G base can often form hydrogen bonds between base pairs. This phenomenon explains why G-rich nucleotides are seldom selected as antisense drugs: they fold on themselves easily, which results in low antisense activity. Two triplets, AAG and AGA, are exceptions. Moreover, these two triplets also belong to the stable groups in vacuum based on hydrogen bond energy, which is similar to the situation in water. The corresponding antisense oligodeoxynucleotide triplets for 5'-AAG-3' and 5'-AGA-3' are therefore 5'-CTT-3' and 5'-TCT-3'. Combined with the deoxynucleotide triplets mentioned above, some of these sequence motifs are found in antisense drugs such as GEM91 [18,19] (Fig. 1), ISIS2922 [10] (Fig. 2) and c-myb [5,15] (Fig. 3). Sequence motifs in oligonucleotides whose presence is correlated with antisense activities have also been identified by Matveeva et al [21], who reported that CCAC, TCCC, ACTC, ATCC, GCCA and CTCT are motifs positively correlated with antisense activities, which is similar to our finding.
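The mapping from an RNA target triplet to its antisense oligodeoxynucleotide triplet is a reverse complement with T in place of U. A minimal sketch is given below; the function name `antisense_dna` is hypothetical, and the routine only performs Watson-Crick complementation without any energy calculation.

```python
# Watson-Crick partner in DNA for each RNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def antisense_dna(rna_5to3):
    """Return the antisense DNA strand, written 5'->3', for an RNA written 5'->3'."""
    # Reverse first so that the result is also read in the 5'->3' direction.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_5to3.upper()))

print(antisense_dna("AAG"))  # CTT
print(antisense_dna("AGA"))  # TCT
```

The two printed values reproduce the 5'-CTT-3' and 5'-TCT-3' triplets given in the text for 5'-AAG-3' and 5'-AGA-3'.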
The findings above suggest that G/C-rich and U-containing hybrids possess lower energy and that some related oligodeoxynucleotide sequence motifs can be derived from the useful oligonucleotide triplet hybrids above. ISIS2922 is an antisense oligonucleotide with antiviral activity against cytomegalovirus, in phase III clinical testing. Its nucleotide sequence (5'-GCGTTTGCTCTTCTTCTTGCG-3') includes nine triplets (underlined, under-dotted, hatched and framed), in which two peculiar motifs were found, 5'-TGCT-3' and 5'-TCTT-3'. The antisense oligodeoxynucleotide c-myb (5'-TATGCTGTGCCGGGGTCTTCGGGC-3') corresponds to the promoters of the protooncogene c-myb and is in phase I/II clinical testing. Its sequence includes eight triplets (underlined, under-dotted, hatched and framed), in which six peculiar motifs were found, 5'-TGCT-3', 5'-GCTG-3', 5'-GTCT-3', 5'-TCTT-3', 5'-TGCTG-3' and 5'-GTCTT-3'. GEM 91 (gene expression modulator) is a 25-mer oligonucleotide phosphorothioate complementary to the gag initiation site of HIV-1. It has been studied in various in vitro cell culture models to examine its inhibitory effects on different stages of HIV-1 replication. Experiments focused on the binding of virions to the cell surface, inhibition of virus entry, reverse transcription (HIV DNA production), inhibition of steady-state viral mRNA levels, inhibition of virus production from chronically infected cells and inhibition of HIV genome packaging within virions. Experiments were also performed in vitro in an attempt to generate strains of HIV with reduced sensitivity to GEM 91. Using in vitro methods that were successful in generating HIV strains with reduced sensitivity to AZT, Yamaguchi K et al [18] were unable to generate strains with reduced sensitivity to GEM 91. At present GEM91, which inhibits both HIV adsorption and HIV integrase, is in phase I/II clinical trials.
The GEM91 sequence (5'-CTCTCGCACCCATCTCTCTCCTTCT-3') contains six peculiar motifs, 5'-CCAT-3', 5'-CATC-3', 5'-ATCT-3', 5'-CCTT-3', 5'-CCATC-3' and 5'-CATCT-3', and includes thirteen triplets (underlined, under-dotted, hatched and framed).
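The motif occurrences reported for the three drugs can be checked with a simple substring scan. The sketch below takes the sequences and the twelve motifs directly from the text; the function name `find_motifs` and the dictionary layout are illustrative choices, and positions are 0-based offsets from the 5' end.

```python
FOUR_MEMBER = ["TCTT", "TGCT", "CCTT", "CCAT", "CATC", "ATCT", "GCTG", "GTCT"]
FIVE_MEMBER = ["TGCTG", "GTCTT", "CCATC", "CATCT"]
ALL_MOTIFS = FOUR_MEMBER + FIVE_MEMBER

DRUGS = {
    "ISIS2922": "GCGTTTGCTCTTCTTCTTGCG",
    "c-myb": "TATGCTGTGCCGGGGTCTTCGGGC",
    "GEM91": "CTCTCGCACCCATCTCTCTCCTTCT",
}

def find_motifs(sequence, motifs=ALL_MOTIFS):
    """Map each motif present in sequence to the list of its start positions."""
    hits = {}
    for motif in motifs:
        # Check every offset so that overlapping occurrences are also reported.
        positions = [i for i in range(len(sequence) - len(motif) + 1)
                     if sequence.startswith(motif, i)]
        if positions:
            hits[motif] = positions
    return hits

if __name__ == "__main__":
    for name, seq in DRUGS.items():
        print(name, find_motifs(seq))
```

With these inputs the scan finds two motifs in ISIS2922 and six each in c-myb and GEM91, matching the counts given above; TCTT occurs three times in ISIS2922.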

CONCLUSION
Our results reveal that, for most binary and ternary complexes, the energies E, Ec, Eb, Et and Enr of DNA/asRNA hybrids are lower than those of RNA/asDNA hybrids, while the energies En and End of DNA/asRNA hybrids are higher than those of RNA/asDNA hybrids; the differences are most evident for E, Ec, Eb and En. The total energy of the GGG/CCC hybrid is the lowest of all the hybrids: the more G/C base pairs, the lower the energy of the triplet hybrid, in the order (G/C)3 < U(G/C)2 < A(G/C)2 < U2(G/C) < AU(G/C) < A2(G/C). U-containing hybrids are stabilized: the more uracils (U), the lower the energy of the triplet hybrid. The energy of GU-containing hybrids is lower than that of CU-containing hybrids for both binary and ternary complexes. The formation of hydrogen bonds between neighboring base pairs, which is often brought about by G, A and U bases, frequently results in distortion of the base pair planes and affects the stability of the triplets. Moreover, some peculiar oligodeoxynucleotide sequence motifs that could be derived from the corresponding triplet hybrids mentioned above are positively correlated with antisense activities. They fall into two groups, four-member motifs and five-member motifs. The former comprises eight motifs, namely 5'-TCTT-3', 5'-TGCT-3', 5'-CCTT-3', 5'-CCAT-3', 5'-CATC-3', 5'-ATCT-3', 5'-GCTG-3' and 5'-GTCT-3'; the latter consists of four motifs, namely 5'-TGCTG-3', 5'-GTCTT-3', 5'-CCATC-3' and 5'-CATCT-3'. Matveeva et al [21] reported that CCAC, TCCC, ACTC, ATCC, GCCA and CTCT are motifs positively correlated with antisense activities, which is similar to our finding. Designing antisense drugs therefore requires not only finding a sequence with low binding energy, but also avoiding motifs that can decrease antisense activity.