To Predict Human Biomarker for the Obesity using Mouse Homologous Expression Data at Different Theiler Stages

Numerous genetic factors, such as MC4R (Melanocortin-4 receptor), POMC (Pro-opiomelanocortin) and SIM1 (Single Minded Gene), are important in obesity and can be used as biomarkers. However, more reliable diagnostic markers are needed, along with new therapeutic strategies that target specific molecules in the disease pathways. For some mouse and human genes, mutations in one or both species are associated with phenotypic characteristics observed in human disease. Gene expression data provide crucial insights into the molecular mechanisms of development, differentiation, and disease. Up-regulation and down-regulation of selected genes can have major effects on diet-induced obesity, but have little or no effect when animals are fed a low-fat diet. In the present study we examined mouse gene expression data at different Theiler stages using GXD BioMart. The interacting partners and pathways of genes already used as biomarkers in mouse as well as in humans were studied. Based on the STRING and KEGG results, the gene NPY1R (Neuropeptide Y1 receptor) was selected as a common interacting partner whose biochemical pathway and interactions are similar to those of MC4R. The present work focuses on comparative genomics and proteomics analysis of NPY1R, which has led to identification of a candidate biomarker by comparison with MC4R, an already known human and mouse biomarker. It has been concluded that both The result shows that both


INTRODUCTION
The prevalence of obesity appears to be leveling off worldwide, yet it remains a major public health concern responsible for substantial socio-economic costs in the 21st century [1]. Obesity is a complex nutritional disease characterized by an increase in body fat mass resulting from an imbalance between energy intake and expenditure. The increase in obesity, together with aging of the population, is likely to be accompanied by a rise in cardiovascular disease [24]. These conditions are interlinked and are associated with immediate and long-term health effects, e.g. diabetes. Both genetic and environmental factors are important in the development of obesity. Innovative and more reliable diagnostic markers of disease are needed, as well as new therapeutic strategies that target specific molecules in disease pathways [24]. Identifying diagnostic and prognostic biomarkers from expression profiling data is of great significance for achieving personalized medicine and designing therapeutic strategies in complex diseases. However, the reproducibility of identified biomarkers across tissues and experiments remains a challenge [2]. The mouse is the model organism most commonly used to study the causes of and treatments for obesity. Many genetic alterations identified in mouse models have a genetic equivalent in humans, because the genes involved are orthologs. For some mouse and human genes, mutations in one or both species are associated with phenotypes characteristic of this disease. Gene expression data provide crucial insights into the molecular mechanisms of development, differentiation, and disease. Rapid changes in diet and lifestyle may rapidly enhance the expression of harmful genes, the effects of which manifest sequentially. There are tissue-specific genes responsible for cellular differentiation and organogenesis. Up-regulation and down-regulation of selected genes can have major effects on diet-induced obesity, but have little or no effect when animals are fed a low-fat diet [3].
Genomics and proteomics are central to a better understanding of the normal function of tissues and their interactions with the environment. This means developing functional genomics, characterizing the tissues in which newly discovered genes are expressed, and using comparative genomics and proteomics to understand the development of tissues, ageing mechanisms, and the signaling routes that enable tissues to function.
Comparative genomics and proteomics is the field that determines the similarity, homology and other degrees of relatedness between two or more gene products. It is well established that hypothalamic and brain stem centers are involved in the regulation of food intake and energy balance, but information on the related regulatory factors and their genes was scarce until the last decade [21]. NPY1R (Neuropeptide Y Receptor Y1) has been found, by RNA in situ hybridization and Northern blot, to be strongly expressed in a variety of tissues, including the trigeminal V ganglion, heart, brain, spleen, lungs, skeletal muscle, kidney and embryo, at embryonic as well as postnatal Theiler stages. NPY plays a significant role in stress, anxiety, obesity, and energy homeostasis via activation of NPY-Y1 receptors (Y1Rs) in the brain [22]. The NPY1R protein is an interacting partner of the products of genes that are used as models in mouse as well as in humans.
Using different bioinformatics tools, NPY1R can be analyzed comparatively at the gene as well as the protein level as a biomarker for obesity. Thus, the present study aims to predict an obesity gene that could be taken as a biomarker in humans by comparing it with a gene that has already been used as a marker in mouse as well as in humans.

MATERIAL AND METHODS
Data retrieval. Human genetic diseases are caused by numerous types of mutations and affect a wide range of tissues throughout the body. Models of human disease have been made by mutating, in mice, the same gene that is responsible for the human condition, and in most cases these models replicate many of the corresponding human disease phenotypes. A list of all mouse and human genes in which a mutation in one or both species is associated with phenotypes characteristic of this human disease was retrieved from the MGI (Mouse Genome Informatics) database (Figure 1). MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease [4]. Mouse and human icons categorize the associated genes by the following conditions:
• The disease is associated with both the mouse and human homologs of the same gene. Mouse genes may appear in this section based on disease models that express homologs of the mouse gene.
• Mouse models involving mutations in the mouse gene or expressing homologs of the mouse gene are associated with this human disease. (OMIM and NCBI data currently do not associate this disease with the homologous human gene.)
• Mutations in the human gene are associated with this human disease phenotype. MGI may not contain evidence that a mutation in the mouse gene represents a model for this disease; if the mouse gene is homologous to a human gene associated with a disease, it is listed as a disease homolog.
We studied three genes, i.e. MC4R, POMC and SIM1, which are known as human and mouse biomarkers. Others are known only as mouse biomarkers, i.e. ASIP, ALMS1, AR, CPE, CRH, NCOA1, NPY1R, LEP, LEPR, and MC3R, from the MGI database (Fig. 1). Theiler stages define development by a set of morphological criteria, e.g. cell number, somite number etc., and gene expression data are recorded at these different stages.
We examined mouse expression data, determined experimentally at different Theiler stages in different anatomical structures, using GXD BioMart. The protein-protein interacting partners and pathways of the already known mouse and human biomarkers were examined using STRING [5] and KEGG [6]. When we searched for a common partner of the three genes MC4R, POMC and SIM1, NPY1R was one of the genes common to all three; it is used as a biomarker in mouse but not in humans, and it shares the neuroactive ligand-receptor interaction pathway with MC4R. NPY1R was therefore taken for further analysis, as it is an interacting partner of MC4R, an already known human and mouse biomarker, and both have the same pathway.

ILNS Volume 45

Comparative analyses at both genetic and proteomic level. MC4R and NPY1R sequences were taken from NCBI [7]. For the gene-level comparison, we performed pairwise sequence alignment with LALIGN to assess the similarity and identity between NPY1R and MC4R [8]. In addition, structural and functional similarity was compared at the protein level. First, the primary structure was analyzed for the physical and chemical properties of both proteins using ProtParam [9]. HSLPred was used to predict the subcellular localization of both proteins [10]. The structures of MC4R and NPY1R were retrieved from PDB [11]. Similarity at the secondary structure level was analyzed using PHD [12]. For conserved domain comparison, SMART was used for identification and annotation of genetically mobile domains and analysis of domain architecture [13]. At the tertiary level, the MC4R and NPY1R structures were superimposed using Chimera to see whether the two structures are similar [14]. In addition, DoGSiteScorer was used to predict pockets and estimate the druggability of both proteins [15]. For validation, protein-protein docking of both proteins with a common interacting partner was done using PatchDock [16]. LEP was the protein chosen for protein-protein docking: when we searched for a common interacting partner of MC4R and NPY1R, LEP was common to both with an evidence score of 0.972 and also had a crystal structure in PDB.

RESULTS AND DISCUSSION
Sequence retrieval. The complete primary sequences of NPY1R (Gene ID: 4886) and MC4R (Gene ID: 4160) were retrieved from the GenBank database at NCBI [7]. Comparative analysis at gene level. The pairwise sequence alignment of the NPY1R and MC4R genes using LALIGN gave a Waterman-Eggert score of 235. The bit score gives an indication of how good the alignment is, i.e. the higher the score, the better the alignment; in this case the bit score was 30.1. The expectation value (E-value) is a statistical measure of the number of matches expected by chance in a random database. The lower the E-value, the more likely the match is to be significant; this comparison gave an E-value of 0.026. When excess similarity or identity is observed, the simplest explanation is that the two sequences did not arise independently but arose from a common ancestor; here the identity and similarity was 50.0%. Comparative analysis at protein level. The NPY1R protein (accession no. NP_000900.1) and the MC4R protein (accession no. NP_005903.2) were retrieved from the GenBank database at NCBI. The NPY1R protein is 384 amino acids long and the MC4R protein is 332 amino acids long [20].
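The percent identity reported for the pairwise alignment can be illustrated with a minimal Python sketch. It is not the LALIGN implementation: the function name percent_identity is hypothetical, the toy sequences are invented for illustration and are not NPY1R or MC4R data, and the convention of excluding gap columns from the denominator is an assumption (alignment tools differ on this point).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both residues match.

    Assumption: columns containing a gap ('-') in either sequence are
    excluded from the denominator; some tools count them instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where neither sequence has a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment, for illustration only.
toy_a = "ACGT-ACGT"
toy_b = "ACCTTAC-T"
print(round(percent_identity(toy_a, toy_b), 1))
```

In the toy example, seven columns are gap-free and six of them match, so the function returns 6/7 of 100%.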
Comparison of primary structure. The physicochemical properties of the proteins were studied with the ProtParam tool. The NPY1R protein was predicted to have a molecular weight of 44392.0 Da, whereas MC4R has a molecular weight of 36942.8 Da. The theoretical isoelectric points (pI) of NPY1R and MC4R were 7.94 and 7.88, respectively, indicating that both proteins are slightly basic and carry a net positive charge at neutral pH. The grand average of hydropathicity, i.e. the GRAVY value, of NPY1R and MC4R was 0.308 and 0.771, respectively; the positive values indicate that both proteins are hydrophobic, i.e. they tend, like nonpolar substances, to aggregate in aqueous solution and exclude water molecules. The subcellular localization prediction using the HSLPred server indicated that both proteins are plasma membrane proteins.
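The GRAVY value discussed above is the sum of the Kyte-Doolittle hydropathy values of all residues divided by the sequence length. The following Python sketch shows that calculation under stated assumptions: the function name gravy is hypothetical, the input is assumed to contain only the 20 standard one-letter amino acid codes, and the short test peptides are illustrative rather than the NPY1R or MC4R sequences.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


# Illustrative peptides only: a positive value suggests a hydrophobic
# sequence, a negative value a hydrophilic one.
print(gravy("LIVAF"))   # hydrophobic residues, positive GRAVY
print(gravy("KRDEN"))   # charged/polar residues, negative GRAVY
```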
Comparison of secondary structure. The secondary structure of the NPY1R and MC4R proteins was predicted using PHD [10]. The results showed that both proteins have an approximately similar composition of helix, strand and coil, indicating similar secondary structures, as shown in Table 1. Comparison of conserved domains. Domains represent one of the most useful levels at which to understand protein function, and domain family-based analysis has a profound impact on the study of individual proteins [23]. A domain is usually defined as a modular functional unit that folds independently. Protein structural domains are classified on the basis of similarities in their structures and amino acid sequences. One motivation for this classification is to find the evolutionary relationships between proteins. The conserved domains were predicted using Pfam domains, which showed that both proteins have the two conserved domains 7tm_1 and 7TM_GPCR_Srsx. The predicted domains, repeats, motifs and features of NPY1R are shown in Table 2. Proteins with a similar shape and nearly similar sequence and/or function are placed in "families" and are assumed to have a closer common ancestor [11].

Structural Superimposition. MatchMaker superimposes protein structures by first creating pairwise sequence alignments, then fitting the aligned residue pairs. In this step we analyzed the RMSD (root mean square deviation) value. RMSD values are commonly used to measure the structural similarity between two optimally superposed three-dimensional protein structures. A very large value means that the two proteins are dissimilar, and zero means that they are identical in conformation. For the NPY1R and MC4R proteins, the RMSD value is 0.603 (Figure 2), which means that both proteins are similar at the tertiary structural level. Active site prediction. Active site prediction is a crucial step in drug discovery. The 3-D structure of an enzyme is analyzed to identify active sites and to design drugs that can fit into them. Predicting a protein pocket's capability to bind drug-like molecules with high affinity, i.e. druggability, is crucial in the target identification phase of drug discovery. The pockets and descriptors were calculated, together with the drug score, for 2F1U (the NPY1R protein) and 2IQR (the MC4R protein). For 2F1U, the first pocket (P0) has a drug score of 0.81 (Figure 3, Figure 4). For 2IQR, the pocket P0 has a drug score of 0.8. The upper limit of the druggability score is 1, as shown on the druggability scale in Figure 5. The result shows that both proteins have nearly equal capacity to bind drug-like molecules. Ligand for protein-protein docking. Common protein partner of NPY1R and MC4R. The common partner of NPY1R and MC4R was predicted using STRING [5]. It predicted that LEP is a common protein that interacts with NPY1R as well as with MC4R (Figure 6). Leptin secreted from adipose tissue binds to the leptin receptor in the hypothalamus.
Leptin binding inhibits neuropeptide Y/agouti-related protein (NPY/AgRP) production and stimulates pro-opiomelanocortin (POMC) production; POMC undergoes post-translational modification to produce peptides such as alpha- and beta-melanocyte-stimulating hormone (α- and β-MSH) via processing by the prohormone convertase 1 (PC1/3) and carboxypeptidase E (CPE) enzymes. α- and β-MSH bind to melanocortin 3 and melanocortin 4 receptors (MC3R and MC4R) and induce their activity [17]. Protein-protein docking. Protein-protein docking of NPY1R and MC4R with the structure of the monomeric form of the leptin protein (1AX8) was performed using PatchDock (complex type: default; clustering RMSD: 4.0). Leptin acts as part of a signaling pathway that regulates the size of the body fat depot: as the level of LEP increases, it may act directly or indirectly on the CNS to inhibit food intake and/or regulate energy expenditure as part of a homeostatic mechanism that maintains constancy of the adipose mass. From the top docking models predicted by PatchDock, one model was chosen on the basis of score. For NPY1R, the maximum score was 13365, with area = 1817.90 and ACE = -450.41. For MC4R, the maximum score was 15360, with area = 2175.80 and ACE = -70.02 (Figure 7).
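The RMSD measure used in the structural superimposition step above can be written out as a short Python sketch. It only evaluates the formula for two coordinate sets that are assumed to be already superposed and paired atom by atom; it does not perform the sequence alignment or fitting that MatchMaker carries out, and the function name rmsd and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired sets of 3-D points.

    Assumption: the structures are already optimally superposed and the
    i-th point in coords_a corresponds to the i-th point in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates, for illustration only.
structure_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
structure_b = [(0.0, 0.1, 0.0), (1.5, 0.0, 0.2), (3.0, -0.1, 0.0)]
print(rmsd(structure_a, structure_a))  # identical conformations give 0.0
print(rmsd(structure_a, structure_b))  # small value for similar conformations
```

A value of zero corresponds to identical conformations, and the value grows as the paired atoms move further apart, which is the interpretation applied to the NPY1R and MC4R comparison.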

CONCLUSION
The prevalence of obesity continues to climb worldwide, making it imperative that animal models sharing characteristics of human obesity and its comorbidities be developed in the quest for innovative preventions and/or treatments [18]. In past literature, NPY1R has been used as a knockout marker for obesity in mouse but has not been used as a biomarker in humans. The comparative genomics and proteomics analysis has led to the identification of a candidate biomarker by comparing it with an already known human and mouse biomarker. We conclude that both proteins are plasma membrane proteins, that both NPY1R and MC4R belong to the same family of GPCRs (G protein-coupled receptors) [19] [20], and that NPY1R bears great similarity to MC4R in its domain organization. Both proteins are structurally and functionally similar. On the basis of this hypothesis, we conclude that the NPY1R protein could be used as a potential biomarker for obesity in humans.