Prevalence of miRNAs in Introns and Cis-Regulatory Regions of Genes of the Somatotropic Axis in Mammals

Corresponding Author: Tatiana Shkurat, Department of Genetics, Southern Federal University, Rostov-on-Don, Russia. E-mail: tshkurat@yandex.ru

Abstract: To understand the role of microRNAs in the genetic control of animal growth, we performed a bioinformatic analysis of the localization of microRNAs in introns and cis-regulatory regions of the somatotropic axis genes GH1, GHRH, SST and IGF1 in Homo sapiens, Macaca mulatta, Pan troglodytes, Pongo abelii, Gorilla gorilla, Ovis aries, Bos taurus, Loxodonta africana, Oryctolagus cuniculus, Canis lupus, Rattus norvegicus and Mus musculus. The results showed a significant difference in the copy number of the investigated microRNAs in the surroundings of the somatotropic axis genes in all primates. Copies of the mir-566, mir-1273, mir-1268, hsa-mir-5096 and hsa-mir-3929 mature sequences were frequently observed in cis-regulatory regions of these genes. The greatest number of motifs in cis-regulatory regions and introns of the investigated genes was detected for hsa-mir-5096 and hsa-mir-1268. We assume that the observed microRNAs may play an important role in the formation of morphological and physiological traits such as weight and height in mammals.


Introduction
Over the last decade, the full genomes of a number of organisms have been sequenced. Genome sequencing showed that the more complex an organism is, the longer and more diverse its non-coding sequences are. It was shown that more than 80% of non-coding DNA is involved in a variety of biochemical and regulatory processes in cells (Ecker et al., 2012). Despite these important findings, we still poorly understand the mechanisms of genetic control of weight and size in mammals. Indeed, the smallest mammal, the bumblebee bat, weighs 2 g, while the largest one, Balaenoptera musculus, weighs 150 tons, so the body mass of mammals may differ 75-million-fold. This complicated regulation is orchestrated by the hormones of the somatotropic axis (Oldham et al., 2000).
On the other hand, miRNAs are important regulators of transcriptional and posttranscriptional gene expression and are usually located in introns and intergenic spaces of the genome. miRNAs are transcribed together with the target gene and regulate its expression. They are highly variable and universal. It is known that one miRNA may control up to one hundred genes. The number of combinations is increasing continuously and forms a delicate hierarchical system (Bartel, 2004; Chandra et al., 2010). Feedback, i.e., programming by the miRNA genes, is also known. This already makes it possible to solve a number of problems, for example stem cell reprogramming (Qian et al., 2010). There is increasing interest in the study of interactions of miRNAs with the mechanisms of genetic regulation of animal growth. In the present study we performed a bioinformatic analysis of the miRNA distribution in introns and gene surroundings of GH1, GHRH, SST and IGF1 in different mammalian species.

Materials and Methods
Full sequences were obtained from the NCBI database (http://www.ncbi.nlm.nih.gov/) using IFITCH, a set of scripts designed for automatic data retrieval from NCBI. miRNA sequences were taken from the miRBase Web site (http://mirbase.org/). At the time of this study, the miRNA database contained 28645 entries representing hairpin precursor miRNAs, expressing 35828 mature miRNA products, in 223 species.
First, we carried out a visual analysis using the "Dotplot" program package designed in our laboratory (Shkurat et al., 2013). Visually identified multiple sequences were analyzed with the "Mscanner" program designed for miRNA typing. The resulting database was filtered using a number of SQL queries. The results of the initial GLAM2 search were filtered to yield only those matches with at least 85% identical nucleotides for microRNA-related hairpins and for the pre-miRNA molecule.
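The 85% identity filter described above can be illustrated with a minimal sketch. It assumes that each candidate site has already been aligned to the miRNA sequence without gaps and has the same length; the function names and the toy sequences are hypothetical and are not taken from the study or from GLAM2 itself.

```python
def normalise(seq):
    """Upper-case a sequence and convert RNA (U) to DNA (T) letters."""
    return seq.upper().replace("U", "T")


def percent_identity(site, mirna):
    """Fraction of identical positions between two equal-length sequences.

    Assumption: the sequences are already aligned position by position.
    """
    a, b = normalise(site), normalise(mirna)
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def filter_sites(sites, mirna, threshold=0.85):
    """Keep only the sites whose identity to the miRNA reaches the threshold."""
    return [s for s in sites if percent_identity(s, mirna) >= threshold]


# Hypothetical 20-nt toy sequences, used only to exercise the filter.
toy_mirna = "ACGUACGUACGUACGUACGU"
toy_sites = [
    "ACGTACGTACGTACGTACGT",  # identical after U -> T conversion
    "ACGTACGTACGTACGTACAA",  # 2 mismatches, identity 0.90
    "ACGTACGTACGTACGAAAAA",  # 4 mismatches, identity 0.80
]
kept = filter_sites(toy_sites, toy_mirna)
```

In this toy run the first two sites pass the threshold and the third is discarded.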

Results
The Analysis of Correlation between the Number of Nucleotides in the Introns and the Growth Hormone Gene Surroundings and some Features of Animal Ontogenesis
We performed the analysis of correlation between the number of nucleotides in the non-coding regions of the GH1, GHRH, SST and IGF1 gene surroundings and morphological and physiological traits of animals, such as adult weight, weight at birth, weight at weaning, sexual maturity, gestation, litter size, postnatal growth rate (from Gompertz function), maximum longevity residual and litters per year.
Data on the number of nucleotides in the growth hormone gene surroundings are shown in Table 1. We analyzed the genomes of the animal species mentioned above, including the introns and the intergenic spaces before and after each studied gene.
The number of nucleotides surrounding the somatotropin gene varied from 10 to 25 thousand base pairs in different mammals. The IGF1 gene, whose product mediates the effects of growth hormone, is surrounded by 140-680 thousand nucleotides.
The noncoding region and intergenic space of the GH1 gene were shortest in Gorilla gorilla and longest in Bos taurus. The number of introns in the GH1 gene correlated positively with the number of litters per year in mammals (r = 0.89±0.00002).
The noncoding region and intergenic space of the IGF1 gene were shortest in Mus musculus (136714 bp) and longest in Loxodonta africana (678338 bp).
The number of nucleotides in the IGF1 gene surroundings correlated positively with the mass of the newborn and of the mature animal (r = 0.67±0.001 and r = 0.75±0.015, respectively) and with the age of onset of sexual maturity (r = 0.72±0.00007). The number of nucleotides in the IGF1 gene surroundings correlated negatively with the postnatal growth rate (from the Gompertz function) (r = 0.78±0.003).
No correlations were found for the other genes.
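As an illustration of how a correlation between sequence length and a trait can be computed, the sketch below implements the Pearson product-moment coefficient. The text does not state which estimator was used, so the choice of Pearson's r is an assumption, and the numbers are synthetic placeholders rather than values from Table 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Synthetic placeholder data (not from the study):
# a made-up region length per species and two made-up traits.
toy_length = [1.0, 2.0, 3.0, 4.0]
toy_trait_up = [2.0, 4.1, 5.9, 8.2]    # grows with length
toy_trait_down = [9.0, 7.2, 4.8, 3.1]  # shrinks with length

r_up = pearson_r(toy_length, toy_trait_up)      # close to +1
r_down = pearson_r(toy_length, toy_trait_down)  # close to -1
```

A trait that increases with region length gives a coefficient near +1, and a trait that decreases with it gives a coefficient near -1.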

GLAM2-Analysis of pre-miRNA Sequences in Cis-Regulatory Regions of the Somatotropic Axis Genes in Mammals
Sequences with 85% or greater homology to the 72 nucleotide sequences of pre-miRNAs from the miRBase database, detected in introns and non-coding DNA in cis-regulatory regions of the somatotropic axis genes, are listed in Table 2.
We analyzed the localization of pre-miRNAs in noncoding DNA (30 thousand base pairs upstream of the start of the gene and 30 thousand base pairs downstream of its end) in 12 mammals. We found miRNA sequences only in primates and rodents (Tables 2 and 3).
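The 30 thousand base pair windows on either side of a gene can be derived from its coordinates as in the following sketch. The 1-based inclusive coordinate convention, the strand handling and the function name are assumptions made for illustration; the coordinates in the example are hypothetical.

```python
def flank_windows(gene_start, gene_end, strand="+", flank=30_000, chrom_len=None):
    """Return (upstream, downstream) windows as 1-based inclusive (start, end).

    Upstream and downstream follow the direction of transcription, so they
    swap sides for a gene on the minus strand. Windows are clipped to the
    chromosome boundaries; a clipped window may be empty (start > end).
    """
    if gene_start > gene_end:
        raise ValueError("gene_start must not exceed gene_end")
    left = (max(1, gene_start - flank), gene_start - 1)
    right_end = gene_end + flank
    if chrom_len is not None:
        right_end = min(chrom_len, right_end)
    right = (gene_end + 1, right_end)
    if strand == "+":
        return left, right
    if strand == "-":
        return right, left
    raise ValueError("strand must be '+' or '-'")


# Hypothetical gene at positions 100000-110000 on the plus strand.
upstream, downstream = flank_windows(100_000, 110_000, "+")
```

For this toy gene the upstream window is 70000-99999 and the downstream window is 110001-140000, each 30000 bp long.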
The greatest number of miRNA copies is localized in the introns and intergenic space of the growth hormone-releasing hormone gene (GHRH), whose product specifically stimulates the secretion of growth hormone (Mallo et al., 1993). This phenomenon is typical of both primates and rodents. The smallest number of miRNA copies was identified in the SST gene surroundings.
In all studied primates we found copies of the hsa-mir-566 gene. Previously it was shown that motifs homologous to the hsa-mir-566 or hsa-mir-619 hairpin sequences are localized in introns in primates. The number of hairpin copies of these miRNAs was more than 280000 in the human genome and more than 290000 in the chimpanzee genome. Moreover, hsa-mir-566 accounted for 45.34% of all types of miRNAs located within introns of human genes (Hill and Sorscher, 2013).
The distribution of the detected miRNA copies in introns is presented in Table 4. Hsa-mir-566 and its homologues were not found in the introns of the GH1 gene. However, this miRNA is present in the introns of the IGF1 gene in all primates: Macaca mulatta (1 copy), Gorilla gorilla (2 copies), Pongo abelii (5 copies), Pan troglodytes (6 copies) and Homo sapiens (6 copies). The IGF1 gene has 4 introns in all primates except Macaca mulatta, where it has 5 introns.
The pre-miRNA hsa-mir-5096 was not found in GH1 and SST introns, but 1 to 4 copies were detected in the introns of the GHRH and IGF1 genes. We observed only one miRNA copy (ppy-mir-1273a) in the introns of the GH1 gene in gorillas.
IGF1 is the most important endocrine mediator of somatotropic hormone activity. IGF1 also provides feedback to the hypothalamus and pituitary within the somatotropic axis, i.e., the secretion of somatotropin-releasing hormone and growth hormone depends on the IGF1 level in blood: it increases when the IGF1 level is low and, conversely, decreases when it is high.
Currently there is no explanation for the high conservation of the hsa-mir-566 or hsa-mir-619 hairpin sequences, which are widespread in primate genomes. However, it was shown that the hairpins of all miRNAs are much more highly conserved than the DNA sequences surrounding them. This finding supports the idea of evolutionary preservation of the hairpin, which binds to related motifs (Hill and Sorscher, 2013).
Further, we investigated the prevalence of miRNA hairpins in the non-coding regions and the surroundings of the genes.

Analysis of hsa-mir-5096 and hsa-mir-1268-Related Hairpins in Somatotropic Axis Genes Surroundings in Mammals
Using GLAM2 analysis of the entire set of noncoding DNA sequences around the GH1, GHRH, SST and IGF1 genes in 12 mammals, we found 714 motifs that were more than 85% homologous to miRNAs, representing 78 different motifs. All motifs homologous to miRNA hairpins were detected only in primates. The greatest number of motifs was detected for the hsa-mir-5096 and hsa-mir-1268 hairpins (101 and 87 copies, respectively). The number of copies of the other miRNAs varied from 8 to 13.
In addition to the high copy number of the hsa-mir-5096 and hsa-mir-1268 motifs, we also observed a high degree of homology, higher than 0.9 (Table 5). Figure 1 shows the occurrence frequency of a motif homologous to hsa-mir-5096 in the somatotropic axis gene surroundings in primates. The greatest number of motif copies was observed in the GHRH gene surroundings in every examined primate: 21 copies were detected in chimpanzees, orangutans and humans. The smallest number of copies (3) was observed in the SST gene surroundings in Macaca mulatta. The number of motif copies in the IGF1 gene surroundings was almost the same (about 7) in all studied primates. Among all studied genes, the largest number of hsa-mir-5096 motif copies (13) was found before the GHRH gene. The distribution of motif copies between different regions of the IGF1 gene was the same in all studied primates, which is associated with the high degree of homology of the sequences in the IGF1 gene surroundings in these animals. Five of the seven copies are located in the introns of the IGF1 gene. This motif was not observed in the region after the IGF1 gene.
The copy number of the hsa-mir-5096 motif and its distribution around the GH1 gene are almost the same in all primates. Only in gorillas were no motif copies found after the GH1 gene. A more precise analysis shows that this redistribution is associated with a significant expansion of the transcribed region of the GH1 gene in the forward direction in gorillas compared with other primates.
The prevalence of the second most frequent motif, homologous to hsa-mir-1268, in the somatotropic axis gene surroundings in primates is shown in Fig. 2. The greatest number of motif copies was detected in the GHRH gene surroundings in every studied primate. In the gorilla GH1 gene surroundings the motif was present in a single copy; the copy number was higher in other primates. The greatest copy number (18) of this motif was found in the GHRH gene surroundings in Macaca mulatta. The hsa-mir-1268 motif was not found in the introns of the GH1 and SST genes in any primate except gorillas. Among all genes, the greatest copy number (12) occurs before the GHRH gene. The distribution of motif copies between different regions of the IGF1 gene was the same in all studied primates, which is associated with the high degree of homology of the IGF1 gene sequences in these animals. These motif copies were commonly located in the introns of this gene.
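The counts of motif copies before a gene, in its introns and after it can be obtained by classifying each hit position against the gene annotation. The sketch below is one possible way to do this; representing a hit by a single position, the region labels, the function names and all coordinates are hypothetical and do not reproduce the data of the study.

```python
from collections import Counter


def classify_position(pos, gene_start, gene_end, introns, strand="+"):
    """Label a 1-based position as 'before', 'after', 'intron' or 'exon'.

    introns is a list of inclusive (start, end) intervals inside the gene.
    'before' and 'after' follow the direction of transcription.
    """
    if pos < gene_start:
        return "before" if strand == "+" else "after"
    if pos > gene_end:
        return "after" if strand == "+" else "before"
    if any(start <= pos <= end for start, end in introns):
        return "intron"
    return "exon"


def tally_hits(hits, gene_start, gene_end, introns, strand="+"):
    """Count (motif name, region) pairs for a list of (motif, position) hits."""
    counts = Counter()
    for motif, pos in hits:
        region = classify_position(pos, gene_start, gene_end, introns, strand)
        counts[(motif, region)] += 1
    return counts


# Hypothetical toy gene spanning 1000-3000 with two introns.
toy_introns = [(1200, 1800), (2100, 2900)]
toy_hits = [
    ("motif_A", 500),   # before the gene
    ("motif_A", 1500),  # first intron
    ("motif_A", 2500),  # second intron
    ("motif_B", 3500),  # after the gene
]
toy_counts = tally_hits(toy_hits, 1000, 3000, toy_introns)
```

Summing such counters over genes and species would give per-motif totals of the kind reported in this section.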

Discussion
Discoveries of the past decade showed that RNA works not only as a functional messenger between DNA and protein but also participates in the regulation and organization of the genome and of gene expression, and that its role grows with increasing complexity of the organism (Hill et al., 2005). An important role of RNA was shown in the epigenetic processes that control differentiation and development. These findings indicate that RNA seems to play a central role in human evolution and ontogenesis (Olovnikov, 2007; 2009).
As functional participants in the regulation of gene networks, miRNAs inhibit the expression of their targets. There is growing interest in the involvement of miRNAs in important cellular functions and biological processes and in identifying the relationships between miRNAs and their targets (Gregory et al., 2008; Jones et al., 2014).
Earlier it was shown that miRNAs participate in the formation of the morphological diversity of species (Heimberg et al., 2008) and that miRNA accumulation in tissues correlates with the rate of morphological change (Wheeler et al., 2009). The complexity of organisms rises with the increasing proportion of transcribed but untranslated regions of the genome, with the increasing length and number of introns in protein-coding genes, with the increasing number of regulatory elements required to regulate the expression of each gene, and with the increasing number of transcription factor binding sites in the genome (Chinwalla et al., 2002; Levine and Tjian, 2003). The number of introns also increases with the phylogenetic complexity of organisms (Hill et al., 2005). miRNAs are important phylogenetic markers because of their strikingly low rate of evolution (Wheeler et al., 2009; Ivashchenko et al., 2014). This is most probably why we found mir-599, mir-1273, mir-1268 and hsa-mir-5096 only in primates; they may serve as phylogenetic markers. On the other hand, the GH protein, encoded by the GH1 gene, plays an important role in human development, growth and metabolism. It is known that growth hormone is a powerful regulator of postnatal growth and development and that its level affects the final size of the body. Our results showed a significant difference in the copy number of the investigated miRNAs in the intergenic space of the GH1 gene and of the gene of its releasing hormone (GHRH) in all primates. We assume that the discovered miRNAs play an important role in the formation of morphological and physiological traits such as weight and height in mammals.

Conclusion
Our results suggest that the miRNAs mir-566, mir-1273, mir-1268, hsa-mir-5096 and hsa-mir-3929 may play an important role in the formation of morphological and physiological characteristics, such as weight and height, in mammals.