Comparative Analysis of UGT1A9 Genetic Polymorphisms between Chinese Han and Tibetan Populations

Although various polymorphisms have been identified in the UGT1A9 gene in the major ethnic groups of the world, no investigation to date has focused on the Chinese Tibetan population or on a comparison between the Chinese Tibetan and Han populations. This study was designed to systematically compare genetic differences between the two populations. We investigated the functional regions of UGT1A9 in 200 unrelated healthy Chinese volunteers, comprising 100 Tibetan and 100 Han individuals from Qinghai and Shaanxi, respectively, by direct sequencing. A total of 21 different genetic variants, including 7 novel variants, were identified. Comparative analysis showed that the allele frequencies of three common variants (-1888T>G, 95246T>C, 96292C>T) were significantly different between the Tibetan and Han populations (p ≤ 0.05). However, there were no differences in linkage disequilibrium patterns, haplotype structures or htSNPs between the two populations. UGT1A9*1b was the prevalent defective allele in the Chinese population. In addition, -1888T>G, -1819T>C and -441C>T affected the binding of transcriptional factors, and four of the missense mutations (P361L, N397H, P448L and Y483D) occurred at positions highly conserved among three different species (Homo sapiens, Rattus norvegicus and Danio rerio). In short, the genetic information of UGT1A9 differs significantly between the Chinese Tibetan and Han populations. The genetic information of UGT1A9 determined here might serve as a baseline for larger studies on pharmacogenomics and also provide important data for the advance of personalized medicine in the Chinese Han and Tibetan populations.


INTRODUCTION
The human UDP-Glucuronosyl Transferases (UGTs) are a family of phase II drug-metabolizing enzymes that catalyze the glucuronidation of a variety of endogenous and exogenous compounds, including drugs, carcinogens and other xenobiotics (Fujita et al., 2006). According to their primary amino acid sequence homology, UGTs are categorized into two major families, UGT1 and UGT2 (Chouinard et al., 2006). The entire UGT1 family is derived from a single gene locus (UGT1A), located on chromosome 2 (2q37), spanning about 210 kb and coding for nine functional proteins (UGT1A1, UGT1A3-UGT1A10) and three pseudogenes (Gong et al., 2001). The UGT2 family can be further divided into two subfamilies, UGT2A and UGT2B, which are encoded by different genes clustered on chromosome 4q13-4q21.1 (Turgeon et al., 2000). The UGT1A isoforms each have a unique exon 1 while sharing the identical exons 2-5; together these determine the specific expression of the UGT1A isoforms (Araki et al., 2005). The regulation and expression of the UGT1A gene are tissue-specific: UGT1A1, UGT1A3, UGT1A4, UGT1A6 and UGT1A9, for example, are expressed in hepatic tissue (Strassburg et al., 1997; 1998), whereas UGT1A7, UGT1A8 and UGT1A10 are exclusively expressed in extrahepatic tissues such as the mouth, esophagus, intestine, pancreas and colon (Ockenga et al., 2003; Strassburg et al., 1997; Vogel et al., 2002).
The UGT1A9 gene, a member of the UGT1 family, is involved in the metabolism of both endogenous compounds, such as estrogens and thyroid hormones, and exogenous chemicals and drugs, such as phenol, acetaminophen, propofol and propranolol (Albert et al., 1999). The UGT1A9 gene is expressed specifically in the liver, kidney, colon and esophagus (Albert et al., 1999; Tukey and Strassburg, 2000). To date, various polymorphisms have been identified in the UGT1A9 gene, some of which decrease UGT1A9 activity toward a variety of exogenous substances and affect the efficiency of drug metabolism. For example, -118(dT)9>10 (UGT1A9*1b) is one of the most important polymorphisms in the promoter region of the UGT1A9 gene and has been reported to be associated with enhanced transcriptional activity of this gene (Yamanaka et al., 2004). This variant was also reported to be associated with a higher incidence of toxicity and poor tumor response (Carlini et al., 2005). UGT1A9 polymorphisms influence SN-38G formation in liver microsomes (Han et al., 2006), and UGT1A9*1b genotypes might be important for SN-38 glucuronidation. In this regard, polymorphisms of the UGT1A9 gene may exert a significant impact on the pharmacokinetics and toxicity of drugs.
China consists of 56 ethnic groups, among which the Han account for 90.56% of the total Chinese population and the Tibetans are one of the major minorities. Up to now, no systematic investigation has focused on the polymorphisms, Linkage Disequilibrium (LD) pattern and haplotype structures of UGT1A9 in the Tibetan population. Similarly, comparisons of genetic information between the Tibetan and Han populations are rare. Genetic information may differ owing to ethnic and geographic differences (Mehlotra et al., 2007). Polymorphisms, genotypes and haplotypes may collectively provide more effective aids for individualized treatment. In this study, we conducted a comprehensive analysis of the genetic information of UGT1A9 in the Tibetan and Han populations in order to identify and compare the genotypes, allele frequencies, LD pattern, haplotype structures and haplotype tagSNPs (htSNPs) of the two groups. The genetic information of UGT1A9 determined here might serve as a baseline for larger studies on pharmacogenomics and also provide important data for the advance of personalized medicine in the Chinese Tibetan and Han populations.

MATERIALS AND METHODS
Study populations: Two hundred healthy unrelated Chinese individuals from two different regions of the Chinese mainland were recruited for the study. They formed two groups: one hundred Han volunteers from Yulin, Shaanxi province and one hundred Tibetan volunteers from Qinghai province. Each group included 50 males and 50 females aged 18-40 years. All participants provided detailed personal information, which allowed us to confirm that the members of each group shared the same ethnic origin. All volunteers provided written consent for the use of their peripheral blood samples for experimental purposes, and the present study was reviewed and approved by the ethics committee of Northwest University.
Polymerase Chain Reaction (PCR) and DNA sequencing: Systematic polymorphism screening was performed using PCR and direct sequencing. A 2 ml sample of venous blood was collected from each subject. Genomic DNA was extracted from the peripheral blood leukocytes of the 200 subjects by the standard procedure (Fujita et al., 2006). The extracted DNA was dissolved in sterile distilled water and stored at -80°C until PCR analysis. The promoter region, all exons and the 3'UTR of the UGT1A9 gene were amplified with eight sets of primers and directly sequenced on an ABI PRISM 3700 DNA analyzer (Applied Biosystems, Foster City, California, USA). The obtained sequences were examined for the presence of variants using Sequencher software (version 4.10.1, Gene Codes Corporation, Ann Arbor, Michigan, USA). The A in the ATG translation start codon is denoted nucleotide +1. The sequence of the complete UGT1A9 gene described in GenBank (Gene ID: 54600) was used as a reference.
Statistical analysis: Allele and genotype frequencies were calculated by the counting method. The χ² test or Fisher's exact test was used to compare allele and genotype frequencies between the Tibetan and Han populations. Statistical significance was set at p<0.05. All statistical analyses were performed in SPSS 16.0. Haploview, which is based on the expectation-maximization method, was used to measure LD between each pair of loci and to estimate Lewontin's coefficient D' (Lewontin, 1988) and the correlation coefficient r² (Hill and Robertson, 1968). An r² of 0.8 was selected as the threshold for all analyses. The block structures and their haplotype frequencies were also estimated using Haploview version 3.2 (Stephens et al., 2001). htSNPs were selected using the Haploview version of the Tagger program.
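The allele-frequency comparison described above can be illustrated with a minimal sketch. It is not the SPSS procedure used in the study: the function names are hypothetical, the allele counts are invented for illustration, and the χ² statistic is the uncorrected Pearson form (whether a continuity correction was applied in the original analysis is not stated).

```python
import math


def allele_counts(genotypes, variant_allele):
    """Count variant and reference alleles in diploid genotypes such as 'TG'."""
    variant = sum(g.count(variant_allele) for g in genotypes)
    return variant, 2 * len(genotypes) - variant


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test: sum over tables no more likely than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = math.comb(n, col1)

    def prob(x):
        # Hypergeometric probability of a table with x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical allele counts (variant, reference) for 100 individuals per group.
tibetan = (10, 190)
han = (30, 170)
chi2, p_value = chi_square_2x2(*tibetan, *han)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Fisher's exact test is the usual choice when an expected cell count is small, which is typically the case for rare variants in samples of this size.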
Functional predictions: Polymorphisms in the promoter region may influence the binding of Transcriptional Factors (TFs) to specific sites, including the type and number of factors bound. The web-based TFSEARCH software was used to analyze transcriptional factor binding in the promoter region. The normal and variant sequences were analyzed separately, and the influence of each polymorphism on TF binding was inferred from the difference between the two analyses. For non-synonymous variants, conservation was assessed with the web-based Protein BLAST software (http://blast.ncbi.nlm.nih.gov/) in three species: Homo sapiens (h), Rattus norvegicus (r) and Danio rerio (d).
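The conservation check for non-synonymous variants can be sketched as follows. This is an illustrative assumption rather than the Protein BLAST workflow itself: the function names are hypothetical, and the three short aligned sequences are toy strings, not real UGT1A9 orthologs.

```python
import re


def parse_missense(label):
    """Split a label such as 'P361L' into (reference residue, position, variant residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a missense label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def column_is_conserved(aligned, column):
    """True if every aligned sequence carries the same residue (no gap) at a 0-based column."""
    residues = {seq[column] for seq in aligned.values()}
    return len(residues) == 1 and "-" not in residues


# Toy alignment keyed by species code (h, r, d); NOT real protein sequences.
toy_alignment = {"h": "MAPLY", "r": "MSPLY", "d": "MTPIY"}

print(parse_missense("P361L"))
print(column_is_conserved(toy_alignment, 2))  # column holding P in all three
print(column_is_conserved(toy_alignment, 1))  # column holding A, S, T
```

In practice the alignment column for a variant has to be looked up from the human residue position, because gaps shift the coordinates between species.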
LD/Haplotyping analysis: Pairwise LD between SNPs was calculated for the two ethnic groups in Haploview by estimating D' and the correlation of alleles at two loci (r²). The results are shown in Fig. 1 and indicate that the degree of linkage disequilibrium did not appear to differ between the two populations.
The haplotype frequencies and htSNPs determined by Haploview for the two ethnic groups are summarized in Table 4. Each ethnic group had only one LD block, and the haplotype structures were analyzed using Haploview (Fig. 1). Within the LD block, the two groups had similar haplotype structures. Haplotype CCC was the dominant haplotype, with frequencies of 85.0% and 85.5% in the Tibetan and Han populations, respectively. Haplotype TGG had frequencies of 12.5% and 13.0% in the Tibetan and Han populations, respectively.
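The two LD measures reported above can be computed from two-locus haplotype frequencies, as in the following sketch. It assumes that phased haplotype frequencies are already available (Haploview estimates them by expectation-maximization, which is not reproduced here); the function name and the example frequencies are hypothetical and are not the values in Table 4.

```python
def ld_from_haplotypes(freqs, allele_a, allele_b):
    """Return (|D'|, r^2) for two biallelic loci.

    freqs maps two-character haplotypes (locus 1 allele + locus 2 allele)
    to frequencies that sum to 1.
    """
    p_ab = freqs.get(allele_a + allele_b, 0.0)
    p_a = sum(f for h, f in freqs.items() if h[0] == allele_a)
    p_b = sum(f for h, f in freqs.items() if h[1] == allele_b)
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    variance = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / variance if variance > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies for two loci with alleles T/C and G/C.
example = {"TG": 0.85, "CC": 0.125, "TC": 0.015, "CG": 0.01}
d_prime, r2 = ld_from_haplotypes(example, "C", "C")
print(f"|D'| = {d_prime:.3f}, r2 = {r2:.3f}")
```

D' reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two loci carry the same information, which is why r² is the usual criterion for tag SNP selection.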

Functional predictions: In the promoter region, the TFSEARCH analysis showed that some polymorphisms altered transcriptional factor binding. The -1888T allele is predicted to bind the C/EBP transcriptional factor; with the variant allele, no transcriptional factor was predicted to bind the site. The -1819T allele is predicted to bind the Oct-1 and CdxA transcriptional factors; with the variant allele, the predicted factor changed to Nkx-2. The -441C allele is predicted to bind CP2; with the variant allele, no transcriptional factor was predicted to bind the site (Fig. 2). For the four nonsynonymous variants in the coding region, BLAST analysis showed that the missense mutations (P361L, N397H, P448L, Y483D) occurred at positions highly conserved among three different species (Homo sapiens, Rattus norvegicus and Danio rerio), as shown in Fig. 2b.

DISCUSSION
Although many polymorphisms of UGT1A9 have been identified in multiple ethnic groups, no relevant investigation of the Chinese Tibetan population, or comparison with the Han population, has been reported to date. This study conducted a comprehensive analysis of UGT1A9 polymorphisms in these two ethnic groups of China. Of the 21 genetic variants identified, three common variants (-1888T>G, 95246T>C, 96292C>T) had allele frequencies that were significantly different between the Tibetan and Han populations (p ≤ 0.05). Moreover, we also detected some ethnic-specific variants. For instance, 588G>T, 94987T>C, 94990A>C, 96399A>C (N397H), 100375C>T (P448L) and 101044T>C were detected only in the Tibetan population, while -40C>G, 95186G>A, 100479T>G (Y483D) and 100813T>G were detected only in the Han population. Although three nonsynonymous variants [8G>A (C3Y), 98T>C (M33T) and 766G>A (D256N)] and a variant at 726T>G resulting in a premature termination codon TAG (Y242X) have been reported in exon 1 in other ethnic groups (http://www.pharmacogenomics.pha.ulaval.ca/sgc/ugt_alleles), we did not detect these variants in this study. Taken together, there are ethnic differences in the polymorphisms and prevailing mutations of UGT1A9.
We analyzed the LD pattern of the two ethnic groups separately. Strong LD was observed among 100836T>C, 100964G>C and 101065G>C in both populations. The results indicated that the degree of linkage disequilibrium did not appear to differ between the two populations.
Different htSNP selections yield different haplotype structures and, in turn, different haplotype distributions in different populations. The combined effects of some decreased-function variants may lead to inactive enzymes. Different polymorphisms and their combinations may generate markedly different results with respect to UGT1A9 activity. Thus, htSNP detection and haplotype analysis would allow the metabolizer phenotype to be identified easily (Chen et al., 2008).
There were no significant differences in LD pattern, haplotypes or htSNPs between the two populations. Since these parameters can fluctuate with small sample sizes, larger samples are needed to obtain more precise information. TFs regulate gene expression by recognizing and binding cis-regulatory elements in gene promoters. Web-based TFSEARCH (version 1.3) was used to analyze TF binding. The analysis showed that -1888T>G, -1819T>C and -441C>T affected transcriptional factor binding, changing the predicted sites from C/EBP to none, from Oct-1 and CdxA to Nkx-2, and from CP2 to none, respectively (Fig. 2). Further investigations are needed to confirm the impact on gene expression. Apart from the known functional variants UGT1A9*1b, *1d and *1f, we assessed the conservation of other variants with the web-based Protein BLAST software. Four of the missense mutations (P361L, N397H, P448L and Y483D) occurred at positions highly conserved among three different species. This result indicates that these mutations may have significant effects on the function of UGT1A9. Further functional studies are needed to determine these effects.
Glucuronidation, catalyzed by UDP-Glucuronosyl Transferases (UGTs), is one of the critical steps in the detoxification and elimination of various endogenous and exogenous compounds. Some polymorphisms of UGT isoforms are known to affect glucuronidation rates (Radominska et al., 1999; Tukey and Strassburg, 2001). Numerous studies have functionally characterized UGT1A9 polymorphisms that may be associated with altered metabolism or pharmacokinetics of certain drugs. SN-38, the active antitumor metabolite of irinotecan (a main therapeutic drug for the treatment of metastatic colorectal cancer), is detoxified by UGT1A isoforms. Previous studies have shown that UGT1A9*1b, with impaired enzyme function, has a major effect on SN-38 detoxification (Gagné et al., 2002; Cecchin et al., 2009). The UGT1A9 enzyme variant M33T, heterologously expressed in HEK293 cells, showed 1.7-fold reduced intrinsic clearance for mycophenolic acid 7-O-glucuronide (Bernard and Guillemette, 2004) and 26-fold reduced intrinsic clearance for SN-38 glucuronide (Villeneuve et al., 2003). Another UGT1A9 enzyme variant, D256N, showed 22-fold reduced intrinsic clearance for SN-38 glucuronide. In addition, -118(dT)9, one of the UGT1A9 polymorphisms in the promoter region, was associated with increased tumor response. The -118(dT)9/9 genotype was significantly associated with efficacious tumor response when compared with all other genotypes and was also associated with a lower incidence of toxicity, whereas the -118(dT)10/10 genotype predicted poor tumor response (Carlini et al., 2005). Thus, these data have important public health implications. Clinical testing for UGT1A9 polymorphisms should be implemented as a predictor of the toxicity and effectiveness of some drugs.
In this regard, systematically studying the polymorphisms of UGT1A9 might play an important role in the prediction of toxicity and responsiveness to cytotoxic agents, as described for other detoxifying enzymes (Martino et al., 2011).

CONCLUSION
In conclusion, our results showed that there are ethnic differences in the polymorphisms and prevailing mutations of UGT1A9 between the Chinese Tibetan and Han populations. The variants 96292C>T (P361L), 96399A>C (N397H), 100375C>T (P448L) and 100479T>G (Y483D) are predicted to be potentially functional. Our data may serve as a baseline for larger pharmacogenomic studies and also provide important data for the advance of personalized medicine in the Chinese Han and Tibetan populations.