A systematic review and integrative approach to decode the common molecular link between levodopa response and Parkinson’s disease

Background PD is a progressive neurodegenerative disorder commonly treated with levodopa. The findings from genetic studies on adverse drug reactions (ADRs) and levodopa efficacy are mostly inconclusive. Here, we aim to identify predictive genetic biomarkers for levodopa response (LR) and determine the common molecular link with disease susceptibility. Methods A systematic review for LR was conducted for ADRs and drug efficacy, independently. All included articles were assessed for methodological quality on 14 parameters. GWAS of PD were also reviewed. Protein-protein interaction (PPI) analysis using STRING and functional enrichment analysis using WebGestalt were performed to explore the common link between LR and PD. Results From 37 candidate studies on levodopa toxicity, 18 genes were found to be associated, of which CAn STR 13, 14 (DRD2) was most significantly associated with dyskinesia, followed by rs1801133 (MTHFR) with hyper-homocysteinemia, and rs474559 (HOMER1) with hallucination. Similarly, 8 studies on efficacy yielded 4 genes, in which rs28363170, rs3836790 (SLC6A3) and rs4680 (COMT) were significant. To establish the molecular connection between LR and PD, we identified 35 genes significantly associated with PD. With 19 proteins associated with LR and 35 with PD, two independent PPI networks were constructed. Among the 67 nodes (263 edges) in LR and the 62 nodes (190 edges) in PD pathophysiology, UBC, SNCA, FYN, SRC, CAMK2A, and SLC6A3 were identified as common potential candidates. Conclusion Our study revealed the genetically significant polymorphisms concerning ADRs and levodopa efficacy. The six common genes may be used as predictive markers for therapy optimization and as putative drug target candidates. Electronic supplementary material The online version of this article (10.1186/s12920-017-0291-0) contains supplementary material, which is available to authorized users.


Background
Parkinson's disease (PD) is the second most common progressive neurodegenerative disorder after Alzheimer's disease [1]. It affects 1.5% of the global population over the age of 65 years [2]. Characterised by motor symptoms such as gait dysfunction, bradykinesia, rigidity, and resting tremor, PD is believed to be caused by the loss of dopamine in the dopaminergic neurons of the substantia nigra pars compacta [3]. Along with the dopaminergic disruption, non-motor dysfunctions such as depression, sleep disorders, and dementia are also observed in PD patients, plausibly as a consequence of changes in both dopaminergic and non-dopaminergic systems. Pathological confirmation is obtained by the presence of Lewy bodies (fibrillar aggregates consisting mostly of the protein alpha-synuclein) in the affected neurons of the brain [4].
Levodopa (or L-Dopa), ever since its discovery, has been used as a potent anti-Parkinson's medication. It functions as a symptom-alleviating therapy by maintaining the dopamine concentration at the synapse and reducing the motor fluctuations observed in PD patients [5]. Almost 15-20% of patients do not respond to the therapy or show adverse profiles, primarily levodopa-induced dyskinesia [6], after 5 years of therapy. Managing adverse drug reactions (ADRs) is thus one of the most challenging aspects of PD treatment. Carriers of specific genetic polymorphisms in drug-metabolising enzymes, drug transporters, drug receptors, and proteins involved in the drug pathways of anti-Parkinson's drugs may be predisposed to adverse reactions or altered efficacy.
Several susceptibility loci have already been studied in familial cases of PD, such as SNCA (PARK1), LRRK2, PRKN (PARK2), PINK1 (PARK6), and DJ-1 (PARK7) [7]. However, very little has been elucidated about the genetic background of sporadic PD. Neurodegenerative diseases, including PD, are multifactorial in nature. Mechanisms such as mitochondrial dysfunction, Lewy body formation, oxidative stress, altered protein handling, and inflammatory change are considered to lead to cell dysfunction and death by apoptosis or autophagy. Ageing is one of the most studied risk factors for PD, and the biochemical changes that accompany ageing amplify these abnormalities in the brains of PD patients [8]. Candidate gene studies have pinpointed genes such as NAT2, MAOB, GST, mitochondrial tRNA, the S18Y variant of UCHL1, SNCA, the MAPT H1 haplotype, and LRRK2 [9]. GWA studies have identified further risk loci (BST1, GAK, HLA-DR, ACMSD, STK39, MCCC1/LAMP3, SYT11, PARK16, FGF20, and GPNMB), but with significance too low to establish a valid association for clinical management [10]. Also, since the mechanisms of development and progression of PD have not been fully elucidated, current treatment options are targeted only at providing symptomatic respite. Understanding these multiple aspects of PD may therefore benefit clinical intervention.
The aim of the present article is to summarize all the studies on the association of genetic polymorphisms with the outcome of levodopa treatment in sporadic PD patients, in terms of ADRs and altered drug efficacy. We also describe the interplay of the molecular pathways involved in levodopa-induced ADRs, LR, and the disease pathology. This is an attempt to identify genes as molecular targets and to determine whether polymorphisms in such genes predispose certain patient populations to ADRs and altered efficacy. For this purpose, we performed a systematic review across several online databases and selected the relevant articles on the basis of pre-defined inclusion and exclusion criteria, separately for LR and for PD susceptibility. These articles were further assessed for their methodological quality, and finally the data were extracted for the list of genes (and their variants) associated with drug response and disease risk. This effort was further extended using computational approaches such as network modelling to rule out systematic biases from multiple high-throughput datasets and to identify whether any molecular mechanisms involved in LR and PD susceptibility intersect. Such proteins can be plausible targets to minimize toxicity, improve therapeutic efficacy, and capture disease risk.

Methods
All the methodologies used in the study followed the Human Genome Epidemiology Network guidance for the systematic review of genetic association studies [11][12][13][14] and the PRISMA guidelines [15].

Data source and search strategy
A systematic search in Medline [16] and Web of Science [17] was performed using the standard MeSH terms "Parkinson's disease", "variant", "Polymorphism", "SNP", "single nucleotide polymorphism", "pharmacogenomics", and "response" with AND/OR Boolean operators to identify all human studies on the genetics of Parkinson's disease and/or on response to anti-Parkinson drugs. In addition, studies of pharmacogenetic relevance that were not identified by this search were added from the PharmGKB database [18] using the search term "Parkinson's disease". The searches were limited to human studies.

Study selection criteria
The study selection was carried out independently in two stages by two different authors (DG and MKM) from relevant articles published up to March 9, 2016. All articles that were reviews, commentaries, errata, editorials, technical reports, news, or evaluation studies were initially screened out (n = 173). Duplicate articles (n = 260) and articles published in languages other than English (n = 36) were also excluded. First, the articles obtained by the search were screened by title for relevance. Second, the abstracts of all primarily screened articles were retrieved and assessed according to the inclusion and exclusion criteria provided in Additional file 1: s4. The articles were then segregated into those on ADRs of levodopa and those on efficacy of the drug. Only full-text articles were included in the final study corpus. In case of disagreement regarding the screening of the articles, an independent reviewer (PT) was consulted to resolve the discrepancies. The cross-references of the finally selected articles were also searched for additional relevant articles. Further, a search combining the MeSH term for each adverse effect with 'levodopa' or 'L-Dopa' was run to double-check for any missed relevant articles. All the baseline univariate significant allelic/genotypic associations with ADRs and with L-dopa efficacy are reported in Additional file 2: Table S1(8a) and Additional file 3: Table S2(8b), respectively. A similar search was performed for the drug response related articles as well.

Data extraction and quality assessment
Data were extracted by DG and MKM and checked by PT and RK. In case of sequential or multiple publications from the same group of authors, only the most recent article, or the studies reporting exclusive findings, were included. Data extracted from each eligible publication are provided in Table 1 (complete in Additional file 2: Table S1(8a)) for ADR articles and Table 2 (complete in Additional file 3: Table S2(8b)) for drug efficacy studies. Ethnicity was classified as African, Asian or Caucasian [19][20][21][22][23]. If the ethnicity was not reported, the source population was considered based on the country in which the study was conducted (e.g. a study conducted in Chicago). The genetic associations were stratified by ethnicity/population to explore inter-ethnic variations. All the subject populations have been categorised into their respective super-populations based on the 1000 Genomes Project (Phase II).
A systematic analysis of the GWAS conducted so far on PD susceptibility included twenty studies (consisting of 45,465 cases and 173,222 controls). Sixty-one loci have been identified as significantly associated with the disease risk (p ≤ 0.01 × 10⁻⁸) (Table 3). A detailed summary of these significant genetic variants obtained from GWAS in the field of PD is presented in Additional file 4: Table S1. The genes BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, ITGA8, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25 are disease risk loci following the collaborative meta-analyses. SNCA (p ≤ 4.16 × 10⁻⁷³) and MAPT (p ≤ 2.37 × 10⁻⁴⁸) have been studied in 9 and 7 GWA studies, respectively, establishing their functional relevance in the disease physiology and making them the most prominent loci. They are followed by LRRK2 and GAK in 4 studies each, and by GBA/SYT11 and MCCC1/LAMP3 in 3 studies each.
Two reviewers (DG and MKM) independently assessed the methodological quality of all the selected articles using a predefined set of criteria. All included studies were assessed for the quality of the data presented using modified criteria suggested by Wells K. et al. [24]. The quality assessment was scored on 14 parameters (Additional file 1: s5 and s6), with a positive score awarded for each detail present in the study; the lack of a detail was recorded as either NA (not applicable) or NR (not reported). NA received an equal positive score and was given only when the study was deemed independent of the parameter; NR received no score and was independently assigned by the authors if the methodology was found insufficient or unreported. The detailed list of the 14 parameters used for the quality assessment is given in Additional file 1: s7. Conflicting scores were resolved by consensus after discussion (RK and PT). If the score was 11 or higher, the study was ranked as high quality.
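The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each parameter is assumed to contribute exactly one point, and the rating labels ("reported", "NA", "NR") and function names are hypothetical, not taken from the authors' materials.

```python
N_PARAMETERS = 14          # number of quality parameters stated in the text
HIGH_QUALITY_CUTOFF = 11   # "11 or higher" is ranked as high quality


def quality_score(ratings):
    """Sum the score of one study.

    Assumption: a reported detail and an NA (not applicable) rating each
    count one point, while NR (not reported) counts zero.
    """
    if len(ratings) != N_PARAMETERS:
        raise ValueError(f"expected {N_PARAMETERS} ratings, got {len(ratings)}")
    score = 0
    for rating in ratings:
        if rating in ("reported", "NA"):
            score += 1
        elif rating != "NR":
            raise ValueError(f"unknown rating: {rating!r}")
    return score


def is_high_quality(score):
    """Apply the cut-off stated in the text."""
    return score >= HIGH_QUALITY_CUTOFF


# Hypothetical study: 10 details reported, 1 not applicable, 3 not reported.
example = ["reported"] * 10 + ["NA"] + ["NR"] * 3
print(quality_score(example), is_high_quality(quality_score(example)))
```

Treating NA as a positive score, as the text specifies, keeps a study from being penalised for parameters that do not apply to its design.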

Protein-protein interaction network
Deciphering the molecular link between the genes studied for LR and those studied for disease risk would help rule out bias, if any, identify the interacting proteins of drug response, and thus elucidate the genetic landscape of the disease. Two independent protein-protein interaction (PPI) networks were constructed using the STRING application with a minimum required interaction score of 0.7 (high confidence), active interaction sources restricted to known interactions (databases and experiments), the maximum number of interactors in the 1st shell set to no more than 50, and no 2nd shell interactions [25]. Pathway enrichment analysis was conducted with WebGestalt [26], using Pathway Commons enrichment analysis, the default GO Slim classification, a significance level of 0.001, and a minimum of 5 genes per category. The enrichment analysis was run with the false discovery rate (FDR) adjusted using the Benjamini-Hochberg (BH) procedure, and the results were obtained independently for the two sets of genes.
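For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the following is a minimal Python sketch of the standard procedure. It is not the WebGestalt implementation, and the p-values in the example are invented for illustration; only the 0.001 significance level comes from the text.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input.

    Each p-value is multiplied by m / rank (rank 1 = smallest), and a
    running minimum taken from the largest rank downwards keeps the
    adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways (not results from the study).
raw = [0.0001, 0.04, 0.0003, 0.2]
adjusted = benjamini_hochberg(raw)

SIGNIFICANCE_LEVEL = 0.001  # level stated in the text
significant = [i for i, q in enumerate(adjusted) if q <= SIGNIFICANCE_LEVEL]
print(adjusted, significant)
```

Only pathways whose adjusted value stays at or below the chosen level would be reported as enriched.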

Search and study selection
The workflow of the search and study selection is represented in Fig. 1. A total of 1041 articles were obtained, of which 469 were excluded as duplicates, reviews, or articles in other languages. The remaining 572 articles were screened by title and abstract following the inclusion and exclusion criteria, resulting in the further exclusion of 498 articles: 189 studies discussed other diseases or comorbid conditions; 16 studies on familial PD were removed because the current study focuses on the sporadic form of the disease; 76 studies were based on animal or in vitro models of PD; 40 studies were not on genetic association; 166 others did not discuss any drug response; and 11 papers on drugs other than levodopa were excluded to narrow the scope of the current study to the most widely prescribed medication. A total of 74 eligible publications were further divided into studies on adverse effects of levodopa and studies on the efficacy of the drug in the patient cohort. Where full-text articles were unavailable, the authors were contacted (n = 22). Articles for which responses were received (n = 16) were included in the study and the rest were excluded (n = 6). From the cross-references of the included studies, six additional articles fulfilled the inclusion criteria [19,27-31]. Thus, finally 38 eligible publications on levodopa-induced ADRs and 8 on drug response had sufficient data available for extraction to carry forward the systematic review.
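The screening counts reported above and in the study selection criteria can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the dictionary keys are illustrative labels chosen here, not categories named by the authors.

```python
RETRIEVED = 1041  # total articles obtained by the search

# Exclusions before title/abstract screening (counts from the Methods).
PRE_SCREEN_EXCLUSIONS = {
    "reviews_commentaries_and_similar": 173,
    "duplicates": 260,
    "non_english": 36,
}

# Exclusions during title/abstract screening (counts from this section).
SCREENING_EXCLUSIONS = {
    "other_diseases_or_comorbidities": 189,
    "familial_pd": 16,
    "animal_or_in_vitro_models": 76,
    "not_genetic_association": 40,
    "no_drug_response": 166,
    "drugs_other_than_levodopa": 11,
}


def remaining_after(total, exclusions):
    """Subtract every exclusion count from the running total."""
    return total - sum(exclusions.values())


after_pre_screen = remaining_after(RETRIEVED, PRE_SCREEN_EXCLUSIONS)
eligible = remaining_after(after_pre_screen, SCREENING_EXCLUSIONS)
print(after_pre_screen, eligible)  # prints: 572 74
```

Both intermediate totals agree with the figures given in the text (572 articles screened, 74 eligible publications).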

Study characteristics
The methodological and demographic characteristics of the ADR studies and the drug efficacy related studies of levodopa are summarised in Table 1 and Table 2 [29,32-34]. Apart from original research articles, four letters [19,21,35,36], two brief reports [37,38], and two short communications [33,39] were included as they had sufficient data pertaining to our inclusion and exclusion criteria. One randomized control trial [40] was also included. The patients recruited in the independent studies were primarily administered levodopa alone (n = 21), a dopamine agonist (n = 4), a DDC inhibitor/carbidopa (n = 4), a COMT inhibitor/entacapone/tolcapone (n = 5) or a MAO-B inhibitor (n = 1). The dose of levodopa administered ranged from 200.00 to 805.14 mg/day. The follow-up period for the recruited subjects ranged from 6 to 10.3 months. The PD subjects recruited were diagnosed by the UK Brain Bank Criteria (UK BBC) [41] in 26 articles, by Gelb's criteria [42] in 1 article, by CAPIT (Core assessment programme for intra-cerebral transplantations) [43] in 1 article, or by an experienced neurologist in 2 articles. The different motor functioning assessment scales used are provided in Additional file 2: Table S1(8a). Variable HY assessment scale cut-offs of 3.5 or more have been used. Out of thirty-seven studies, thirteen focussed exclusively on levodopa-induced dyskinesia, three exclusively on other motor fluctuations, one on wearing on/off, three on hyper-homocysteinemia, and five on hallucination; one study each addressed COMT inhibitor induced toxicity and elevated liver transaminase levels, and twelve studies discussed multiple ADRs in the same cohort of recruited patients. Motor fluctuations were observed to be the most common adverse effect of anti-Parkinson's medications.
Dyskinesia (subjects with tardive dyskinesia, peak dose dyskinesia, and diphasic dyskinesia were grouped together), the most prevalent ADR among the subjects, was present in 45.72% of patients. This group had an early age at onset of motor symptoms, a longer disease duration, and a mean daily levodopa dose of 560.96 ± 321.97 mg/day. In addition to levodopa, around one-third of the total patients were administered a COMT, MAO-B, or DDC inhibitor such as carbidopa, entacapone or tolcapone. In addition to dyskinesia, other motor complications (motor impulsivity, wearing on-off, chorea, dystonia) were observed in 35.74% of total ADR subjects, and hyper-homocysteinemia in 2.62%. Hallucinations occur as a consequence of psychosis; hence the two terms have been used synonymously and their subjects have been summed together, constituting 12.38%. One paper each discussed COMT inhibitor induced toxicity (0.26%) and elevated transaminase levels (3.27%).

Methodological quality
The cumulative quality assessment scores obtained by the individual ADR studies are presented in Additional file 2: Table S1(8a) and those of the drug efficacy studies in Additional file 3: Table S2(8b). In ADR studies, the mean methodological assessment score was 10.56 (SD 2.15), range 7 to 13. On the modified scale, twenty of thirty-seven articles were deemed good quality with a cut-off score of ≥11, thirteen articles scoring 9-10 were categorised as moderate quality, and scores below 9 were judged as poor quality, which included four articles. In L-dopa efficacy studies, the mean methodological score was 11.13 (SD 1.86), ranging from 7 to 13. Six articles qualified as good quality, and one each as moderate and poor quality.
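The three quality tiers described above can be written as a small classifier. This is a sketch of the stated cut-offs only; the function name and tier labels are chosen here for illustration, and the per-study scores themselves are in the additional files rather than reproduced here.

```python
def quality_tier(score):
    """Map a methodological quality score to the tier used in the text.

    Cut-offs from the text: >= 11 is good, 9-10 is moderate,
    anything below 9 is poor.
    """
    if score >= 11:
        return "good"
    if score >= 9:
        return "moderate"
    return "poor"


# Tier counts reported in the text, used as a consistency check.
ADR_TIER_COUNTS = {"good": 20, "moderate": 13, "poor": 4}
EFFICACY_TIER_COUNTS = {"good": 6, "moderate": 1, "poor": 1}

print(sum(ADR_TIER_COUNTS.values()), sum(EFFICACY_TIER_COUNTS.values()))
```

The tier counts add up to the thirty-seven ADR studies and eight efficacy studies assessed.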

Genetic factors in other LR
On systematic extraction of the published literature on LR in terms of drug efficacy, only eight studies met our defined inclusion and exclusion criteria. The enzymes directly involved in the metabolism and activity of levodopa have evidently been the most studied in relation to altered LR. rs4680 (COMT) [48], rs6280 (DRD3) [51], rs921451, rs3837091 (DDC) [52], and rs28363170, rs3836790 (SLC6A3) [54] were the significant variants associated with reduced LR (Additional file 3: Table S2(8b)). However, no conclusive results could be drawn from this systematic analysis due to large variability and low significance.

Genetics of PD susceptibility
Employing GWAS datasets to stratify disease susceptibility loci, we followed an unbiased approach to identify such loci in sporadic PD cases. Nalls et al. (2014) [69] recently conducted a large-scale meta-analysis to identify the loci associated with disease risk. We kept this study as the base of the systematic review of all the GWAS on PD risk and added the more recent studies to it. Twenty studies were included in the systematic review, with 45,465 cases and 173,222 controls, mostly from Caucasian populations, followed by Jewish, Chinese and Japanese populations. Sixty-one loci in genes such as BST1, CCDC62/HIP1R, TMEM175/DGKQ/GAK, GBA, ITGA8, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25 were associated with disease susceptibility, as shown in Table 3. Additional file 4: Table S1 tabulates all the loci found to be associated with disease susceptibility, with the significant single nucleotide polymorphisms (SNPs) in bold (p value ≤ 1.0 × 10⁻⁸).

Protein-protein interaction network
We performed PPI analysis using the genes obtained from the systematic review of LR and disease risk in order to understand the functional association among the genes in the respective gene modules. With 19 proteins associated with LR and 35 with PD, two independent PPI networks were constructed using the STRING database to identify critical candidate genes/proteins (Fig. 2). In the LR network, the 67 nodes represent the genes, and the 263 edges weight the likelihood that the connected nodes share common biological functions. In the PD susceptibility network, 62 nodes linked by 190 edges are depicted. Functional enrichment analysis using WebGestalt identified three common pathways (alpha-synuclein signalling, the ADP-ribosylation factor 6 (Arf6) downstream pathway, and the insulin-like growth factor 1 (IGF1) pathway) among the top 20 pathways for LR and PD (Fig. 3). The common six

Discussion
PD is a progressive brain disease which causes significant movement disability [70]. The treatment is aimed at symptomatic management rather than complete cure. However, the challenge is the large clinical variability in drug response and adverse effects on prolonged therapy. Discerning the genetic factors responsible for this variability in drug toxicity and efficacy can provide better clinical management. This study identifies such genetic variants in genes involved in L-dopa metabolism and in the disease etiology by a systematic review approach. Further, starting from the limited number of genes obtained from the systematic review, we extended our effort to integrated computational approaches such as network modelling and functional enrichment to identify other interacting proteins and thereby distinguish the common proteins and molecular pathways that participate in LR and the disease. We additionally show the limitations of the published literature and give insights that may be useful to future studies.
We have implemented a modified version of the criteria of Wells K. et al. (2009) [24], with five additional parameters, to assess the quality of the articles included in the systematic review. To the best of our knowledge, our study has incorporated the most comprehensive methodological quality assessment scoring for screening articles in a systematic review. Candidate gene studies have been screened for the systematic review of LR.
A total of 18 genes from the 37 ADR studies and 4 genes from the 8 efficacy studies were retrieved after the systematic review. Most of the studies examined genes related to the dopaminergic pathway, and the roles of these genes are depicted in Fig. 4. For instance, among ADR studies, CAn STR 13, 14 (DRD2) was found to be most significantly associated with dyskinesia, rs1801133 (MTHFR) with hyper-homocysteinemia, and rs474559 (HOMER1) with hallucination. Carriers of the 13, 14 alleles are found to have a lower risk of developing dyskinesia, but the role of this repeat is still unknown. Patients with the TT677 (rs1801133, 677C > T) genotype exhibit 50% reduced activity of the MTHFR enzyme, consequently elevating plasma homocysteine levels [32]. Carriers of the rs474559 G allele (HOMER1) have a lower prevalence of dyskinesia, as the variant might disrupt glutamatergic transmission [60]. In efficacy related studies, rs28363170, rs3836790 (SLC6A3) and rs4680 (COMT) were important. Individuals with the rs3836790 6/6 or rs28363170 10/10 (SLC6A3) genotypes have higher transporter expression, leading to lower dopamine levels at the synapse [71]. The haplotype structure formed by four SNPs (rs6269: A > G, rs4633: C > T, rs4818: C > G, rs4680: A > G) characterises COMT enzyme activity as low (ACCG), medium (ATCA) or high (GCGG) [72,73]. Levodopa metabolism is affected accordingly, altering the synaptic dopamine concentration. We also observed that the SLC6A3, COMT, and DRD3 genes were common between the ADR and efficacy studies, resulting in 19 exclusive genes from the LR studies.
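The COMT haplotype classification described above amounts to a lookup keyed on the four alleles in a fixed SNP order. The Python sketch below illustrates that mapping using only the three haplotypes named in the text; the function name and the input format (one phased allele per SNP) are assumptions made here, and any other allele combination is deliberately left unclassified.

```python
from typing import Dict, Optional

# SNP order used to spell the haplotype, as listed in the text.
COMT_SNP_ORDER = ("rs6269", "rs4633", "rs4818", "rs4680")

# Haplotype-to-activity mapping stated in the text.
COMT_HAPLOTYPE_ACTIVITY = {
    "ACCG": "low",
    "ATCA": "medium",
    "GCGG": "high",
}


def comt_activity(alleles: Dict[str, str]) -> Optional[str]:
    """Return the activity class for one phased haplotype.

    `alleles` maps each of the four rsIDs to a single allele on the same
    chromosome (an assumed input format). Haplotypes not named in the
    text return None rather than a guess.
    """
    haplotype = "".join(alleles[snp].upper() for snp in COMT_SNP_ORDER)
    return COMT_HAPLOTYPE_ACTIVITY.get(haplotype)


example = {"rs6269": "G", "rs4633": "C", "rs4818": "G", "rs4680": "G"}
print(comt_activity(example))  # prints: high
```

Because the lookup expects phased alleles, unphased genotype calls would first need haplotype inference, which is outside the scope of this sketch.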
In addition, to identify the disease genes involved in drug response and vice versa, genes implicated in PD susceptibility by GWAS were also retrieved. This yielded 61 significantly associated SNPs (p value ≤ 1.0 × 10⁻⁸) pertaining to 35 genes (Additional file 4: Table S1). An integrated network analysis then resulted in six common molecular targets (SNCA, FYN, SRC, UBC, CAMK2A, and SLC6A3) from the overlap between the 67 nodes (and 263 edges) in LR and the 62 nodes (and 190 edges) in PD pathophysiology. Among the six common molecular targets, SNCA has been widely established as a major player in PD susceptibility, as it is a major component of Lewy bodies and mutant SNCA has a greater tendency to misfold [70,71]. Aggregation of SNCA has been shown to be neurotoxic for the cell through the formation of intermediate aggregates called protofibrils [74]. A recent report demonstrated the distinct role of alpha-synuclein fibrils as the major toxic form, resulting in progressive motor impairment and cell death and leading to neurotoxic phenotypes in PD [74]. FYN, a tyrosine kinase family protein, is found inside nerve cells and helps in communicating signals or chemical instructions between different cellular components. This protein has been observed to be modified on levodopa administration, causing dyskinesia [75]. Wang et al. (2016) validated neuro-inflammation inhibition via the SRC (SRC proto-oncogene, non-receptor tyrosine kinase) signalling pathway as a potential drug and disease candidate, which supports our finding of SRC as a common molecular bridge between drug response and disease pathology [76]. UBC belongs to the ubiquitin C family, which carries out ubiquitin-mediated proteolysis and aggregates ubiquitin monomers in the diseased brain. The ubiquitin proteins cause aberrations in the ubiquitin proteasome system (UPS), leading to PD pathogenesis [77].
The neuronal protein CAMK2A responds to changes in intracellular calcium ion concentration and is abnormally activated following dopamine depletion, thus modulating neuronal function in the striatum [78]. Zhang et al. (2014) also established an interaction between CAMK2A and dopamine D2 receptors in striatal neurons that is sensitive to long-term levodopa administration in PD rats [79]. Finally, the SLC6A3/DAT1 variants have a significant effect on striatal activation and performance in PD, as suggested by Habak et al. These results provide evidence for the role of these candidates in both levodopa metabolism impairment and disease risk. Further, these plausible biomarkers might bridge the path between levodopa metabolism and disease pathology, resulting in reduced ADRs, optimum efficacy, and accurate diagnosis.
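The overlap step behind the six common targets is, in essence, an intersection of the node sets of the two networks. The Python sketch below shows that operation on toy edge lists; every edge and every `*_ONLY_*` node name is a hypothetical placeholder, not STRING output, and only a subset of the shared gene names from the text is used.

```python
def nodes_of(edges):
    """Collect the set of node names appearing in an edge list."""
    return {node for edge in edges for node in edge}


def shared_nodes(edges_a, edges_b):
    """Return the sorted node names present in both networks."""
    return sorted(nodes_of(edges_a) & nodes_of(edges_b))


# Toy edge lists (hypothetical): they only demonstrate the set logic and
# do not represent real interactions from the LR or PD networks.
lr_edges = [
    ("SLC6A3", "SNCA"),
    ("SNCA", "LR_ONLY_1"),
    ("FYN", "SRC"),
    ("LR_ONLY_1", "LR_ONLY_2"),
]
pd_edges = [
    ("SNCA", "SLC6A3"),
    ("SRC", "FYN"),
    ("SNCA", "PD_ONLY_1"),
]

print(shared_nodes(lr_edges, pd_edges))
```

Applied to the full node lists exported from the two STRING networks, the same intersection would reproduce the list of common candidates, provided gene identifiers are normalised to one naming scheme beforehand.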
Functional enrichment analysis revealed the alpha-synuclein pathway [74] to be the most significant candidate for both the disease related and the response related gene sets, followed by other growth factor signalling pathways such as the Arf6 downstream pathway and the IGF1 pathway. Signalling by Arf6 (ADP-ribosylation factor 6) plays a role in Ras-mediated cell signalling [80]. It is also responsible for the intracellular trafficking of DRD2 by GRK and PKC proteins [81]. The potential role of IGF1 (insulin-like growth factor 1) signalling has been studied in human neurodegeneration, where it participates in functions such as brain neuron survival, synaptic transmission, and plasticity [82]. Bernhard FP et al. (2016) [83] also established that IGF1 might serve as a PD prediction marker, observing elevated levels of IGF1 in PD patients. Additional file 5: s2 and Additional file 6: s3 tabulate the enriched functions obtained for the levodopa response genes and the PD related genes. We have highlighted the potential usefulness of these biological functions in PD treatment, which can be affirmed in in vitro and/or in vivo model systems.
Although significant findings have been observed in our study, several limitations exist. The papers included in the systematic review presented high heterogeneity in terms of diagnosis, response criteria, drugs administered at different doses, and genotyping techniques. As suggested by Schumacher-Schuh et al. (2014) [84], there is no clinical instrument that adequately measures ADRs given their phenotypic heterogeneity, whereas in terms of efficacy, several response rating scales have been used. In GWAS, the assayed SNPs usually mark a genome region that influences the studied phenotype. However, we have picked the annotated genes corresponding to the significantly associated SNPs from the respective studies to identify the proteins that play a role in the biological processes which ultimately influence the phenotype. Motor fluctuation, a common ADR of levodopa, lacks a clear clinical classification and hence assessment. A regular record of the patient's motor state would be preferable. Genetic heterogeneity is another source of variability between studies, because different markers in the same genes were employed for these associations; moreover, patients with different genetic backgrounds may not be strictly comparable. One major limitation of network biology is the quality and coverage of the interactions. The rates of false positives and false negatives are high, which shows the need to rank the reported interactions for further validation.

Conclusion
In summary, the present study provides a framework for better understanding the molecular interplay between L-Dopa metabolism and PD pathophysiology, and also a means to evaluate putative biomarkers to bridge the gap between treatment outcome and disease risk. We propose that the above six genes could be useful in predicting both LR and disease risk simultaneously. This, however, warrants further experimental validation before it can be developed into a targeted therapy. Translating this evidence through future validation could support the development of pre-diagnostic markers with clinical application. A definitive role of these molecular targets in disease progression could also lead to substantive advancement in PD treatment.

Funding
Financial support from the Council of Scientific and Industrial Research (CSIR) and the Indian Council of Medical Research (ICMR) is duly acknowledged. Financial support from the CSIR-funded projects GENCODE-A (BSC0123) and GOMED (MLP1601) to carry out this study is duly acknowledged. DG acknowledges CSIR projects (BSC0123 and MLP1601), MKM acknowledges MLP0901, PT acknowledges CSIR (Research Associateship) and CR acknowledges UGC, Govt. of India for providing fellowship.

Availability of data and materials
The data supporting the results of this research paper are included within this article and its additional supplementary files.