This article has Open Peer Review reports available.
Integrated analysis of DNAmethylation and gene expression using highdimensional penalized regression: a cohort study on bone mineral density in postmenopausal women
 Tonje G. Lien^{1}Email authorView ORCID ID profile,
 Ørnulf Borgan^{1},
 Sjur Reppe^{2, 3},
 Kaare Gautvik^{3, 4} and
 Ingrid Kristine Glad^{1}
https://doi.org/10.1186/s1292001803412
© The Author(s) 2018
Received: 7 May 2017
Accepted: 21 February 2018
Published: 7 March 2018
Abstract
Background
Using highdimensional penalized regression we studied genomewide DNAmethylation in bone biopsies of 80 postmenopausal women in relation to their bone mineral density (BMD). The women showed BMD varying from severely osteoporotic to normal. Global gene expression data from the same individuals was available, and since DNAmethylation often affects gene expression, the overall aim of this paper was to include both of these omics data sets into an integrated analysis.
Methods
The classical penalized regression uses one penalty, but we incorporated individual penalties for each of the DNAmethylation sites. These individual penalties were guided by the strength of association between DNAmethylations and gene transcript levels. DNAmethylations that were highly associated to one or more transcripts got lower penalties and were therefore favored compared to DNAmethylations showing less association to expression. Because of the complex pathways and interactions among genes, we investigated both the association between DNAmethylations and their corresponding cis gene, as well as the association between DNAmethylations and translocated genes. Two integrating penalized methods were used: first, an adaptive groupregularized ridge regression, and secondly, variable selection was performed through a modified version of the weighted lasso.
Results
When information from gene expressions was integrated, predictive performance was considerably improved, in terms of predictive mean square error, compared to classical penalized regression without data integration. We found a 14.7% improvement in the ridge regression case and a 17% improvement for the lasso case. Our version of the weighted lasso with data integration found a list of 22 interesting methylation sites. Several corresponded to genes that are known to be important in bone formation. Using BMD as response and these 22 methylation sites as covariates, least square regression analyses resulted in R^{2}=0.726, comparable to an average R^{2}=0.438 for 10000 randomly selected groups of DNAmethylations with group size 22.
Conclusions
Two recent types of penalized regression methods were adapted to integrate DNAmethylation and their association to gene expression in the analysis of bone mineral density. In both cases predictions clearly benefit from including the additional information on gene expressions.
Keywords
Background
Poor bone health and low bone mineral density (BMD) lead to reduction in bone mechanical strength. BMD is commonly used in clinical practice as a surrogate measure of bone strength. Osteoporosis (OP) is diagnosed when the bone mineral density is more than 2.5 standard deviations below that of a young adult (30–40 years old), healthy women reference population [1]. In the Western world, more than 40% of women over 50 years experience OP and low energy fractures, and the highest rates are found in the Scandinavian countries [2].
The number of publications describing DNAmethylation in bone cells is limited, but the understanding of its importance is rapidly increasing. DNAmethylation is an important factor in the development and function of bone cells, and therefore also in skeletal disease characteristics [3, 4]. In this paper we explore the impact of genomewide DNAmethylations on BMD further.
In the present study of 80 postmenopausal women (50–84 years), we have measured both global DNAmethylation and global RNA transcripts obtained from the same bone biopsies. Combining this information can give us additional strength in the functional analysis of DNAmethylation. The relationship between DNAmethylation and gene expression is complex [5–7]. While increased methylation in promoter regions tends to reduce transcription, probably by inhibition of transcription factor binding, increased methylation within the transcribed part of DNA often increases transcription, probably due to reduced use of spurious intergenic promoters [8]. Also, binding of transcription factors to a promoter region may promote or inhibit DNAmethylation depending on the properties of that factor. In addition, cellular functions are largely influenced by networks and complex pathways involving several genes, also on other chromosomes, which again have both positive and negative influence.
This paper aims to identify genomewide DNAmethylation sites explaining the variation in BMD in postmenopausal women, while integrating the association between each CpG site and gene expression using a common set of bone biopsies. We assume that methylations strongly associated with transcripts are most relevant in regulation of bone mineral density. This assumption is based on results from several recent studies: We earlier studied bone DNA methylations in the 100 genes representing transcripts most significantly associated with BMD [9]. These DNA methylations were significantly correlated to several of the 100 transcripts. We have also showed, that methylation of the bone SOST promoter is associated with reduced level of its protein sclerostin, an important bone anabolic inhibitor [10]. Furthermore, Del Real et al. [11] performed global DNA methylation and transcription analysis of human mesenchymal stem cells (hMSCs), the precursors of osteoblasts, from femoral heads of women undergoing hip replacement due to fractures using controls with hip osteoarthritis. The authors identified differentially methylated loci situated in genomic regions with enhancer activity, being associated with differentially expressed genes/transcript levels enriched in pathways related to hMSC growth and osteoblast differentiation.
We fit a multivariate regression model using BMD as response and genomewide DNAmethylations as covariates. The number of covariates is much greater than the number of individuals, so ordinary linear regression does not give a unique solution, and it follows that penalized regression methods, also known as shrinkage or regularization methods, are needed. We include gene expression data in the analysis, by allowing for individual penalties in the regression, and let the gene transcript levels guide these penalties. Both cis (DNAmethylation and transcript from the same gene) and trans (DNAmethylation and transcript from different genes) associations are investigated. We point out that we study associations, and that such effects may be causal, but do not necessary need to be. Only in the case when a change in one variable causes a corresponding change in the other variable, can the effect be called causal.
One of the most used penalized regression methods is ridge regression [12]. This method does not assume the quite strong assumption of sparsity, in the sense that only a fairly small number of covariates has an effect on the response variable. It follows that ridge regression includes all covariates in the final model. A ridge regression method which allows for individual penalties is the adaptive groupregularized ridge regression [13]. This is a generic method, which allows for ranked penalties for groups of covariates. We used this method by dividing the methylation sites into groups based on their association to gene transcript levels. We show that this method improves the predictive ability compared to the classical ridge regression in which the external information from gene expressions is not included.
Typically, two aspects are important when fitting a regression model. First a model should be able to predict future data, secondly it should be interpretable [14]. A large number of covariates may complicate the interpretation, especially when the number of covariates p is larger than the number of individuals n. By doing dimension reduction we can get a shortlist of the most important variables. Using this approach, assuming sparsity, we can use the highly popular lasso method [15]. For our dataset it does not give as good prediction as ridge regression, but the resulting short list of genes is easier to interpret. Within this lasso framework, the weighted lasso with data integration allows for individual penalties based on external information [16]. We extend this method further to suit our particular data types. Using this modified method, we improve predictive performance, compared to the classical lasso, where gene expression is not integrated.
This paper is organized as follows: In the method section we first present an overview of the data, next in the section “Quantifying the strength of association between DNAmethylation and gene expression” we present how we test for a significant association between each DNAmethylation and the gene transcript levels, and how we find the pvalues adjusted for multiple testing. The adjusted pvalues are our measure of association between DNAmethylation and gene expression. In the section “Penalized regression using penalty multipliers”, the essential concept of individual penalties in penalized regression is presented. In our penalized regression, with bone mineral density as response, each covariate (hence DNAmethylation) gets an individual penalty. The relationship between these individual penalties and the adjusted pvalues are described in the section “Mapping the penalty multipliers to the adjusted pvalues”. Lastly, in the sections “Adaptive groupregularized ridge regression” and “Extensions to the weighted lasso with data integration”, we present the details in the generic adaptive groupregularized ridge regression method [13], and the extensions to weighted lasso with data integration to fit to our integration setting. The functional enrichment analysis is described in the section “Functional enrichment analysis”. Section “Results” presents the results from the two methods and their consensus, and lastly we discuss in the section “Discussion”, the methodology and the biological findings in this study.
Methods
Cohort description
Postmenopausal ethnic Norwegian women were consecutively recruited at Lovisenberg Diakonale Hospital, the Outpatient Clinic, in Oslo. The survey was performed in 2004–2010. The women showed BMD varying from severely osteoporotic to normal and did not have other diseases or medication known to affect the skeleton. Extensive clinical and biochemical information on this cohort are available in [17].
Quantifying the strength of association between DNAmethylation and gene expression
When investigating the association between each DNAmethylation and gene transcripts, we look at two scenarios, first each DNAmethylation against all possible gene transcripts in the analysis, and secondly each DNAmethylation against only the cis related genes. In both cases, we quantify the strength of an association by using the utility test called the “global test” [20]. The null hypothesis is that there’s no association between gene expressions and DNAmethylation, and the alternative hypothesis is that one or more gene transcript levels are associated with the DNAmethylation level. This test is valid for both highdimensional data (p>n) as well as the case where there are more individuals than covariates. The association between DNAmethylations and gene expressions can be both positive and negative. See Additional file 2 for more details. We performed tests for each DNAmethylation, and got a corresponding pvalue p_{ j } for j=1,…,p. Since a large number of tests is performed, we adjust for multiple testing by controlling the false discovery rate (FDR) [21]. We use the adjusted pvalue, called the qvalue q_{ j }, as a measure of the strength of the association between the particular DNAmethylation and the gene expressions. Moreover, we interpret low values of q_{ j }, as a strong association.
Penalized regression using penalty multipliers
where l_{ j }>0 is a penalty multiplier and will be defined in the next section. Note that it is possible, and sometimes desirable, to have equal λ_{ j } within a group of covariates. The optimization of the weighted penalty regression can, after some scaling of the data, be rewritten as the optimization of the classical penalty regression without multiple penalty terms. For more details, see Additional file 2: Additional materials.
Mapping the penalty multipliers to the adjusted pvalues

be a monotone transformation of the qvalues,

favor the covariates to a degree determined by the data itself,

result in centered l_{ j }, with mean equal to 1, to ease the interpretation,

let all l_{ j }=1 if the external data does not carry any information with predictive value, meaning that the external data is excluded and we end up with the classical penalized methods.
The next two sections explain in detail the construction of such a function in the two cases, adaptive groupregularized ridge regression [13], and our extended version of the weighted lasso with data integration.
Adaptive groupregularized ridge regression
The Bayesian formulation of ridge regression considers a prior for the β_{ j }’s where they are independent and Gaussian distributed with mean zero and variance τ^{2}. Then the penalty parameter λ is proportional to the inverse of this variance. In the adaptive groupregularized ridge regression, first presented in [13], the covariates are divided into groups, and group penalties are introduced, such that all coefficients belonging to a group g have the same Gaussian prior with variance \(\tau _{g}^{2}\). In our case, the covariates were grouped based on the mentioned qvalues from the “global test”, such that the covariates with the lowest adjusted pvalues, the favored covariates, will be in group 1 and so on. We chose the group sizes, such that they reflect the interpretation of the qvalues [13]. In total, there were 100 groups, where the first group had size 10 (the most relevant covariates), and the following groups had increasing group size, as shown in Additional file 3: Figure S6. The last group (with the highest qvalues) was the largest group.
The variance for each group g, \( \tau ^{2}_{g}\), is estimated by an empirical Bayes approach. The groupspecific penalty multiplier for the covariates within group g is then \(l_{g} = K / \hat { \tau }^{2}_{g}\), where K calibrates such that the mean of the inverse penalty multipliers were equal to 1. To enforce monotonicity, the penalty multipliers were calibrated using weighted isotonic regression [26], such that l_{1}≤…≤l_{ G }. After including the penalty multipliers and estimating the resulting updated \(\hat {\boldsymbol {\beta }}\), the procedure was iterated by calculating new empirical Bayes estimates, until the crossvalidated sum of squares was not improving. This method is implemented in the R package GRridge [13].
Extensions to the weighted lasso with data integration
The weighted lasso with data integration in [16], was presented with a penalty on the form λ_{ j }=λl_{ j }=λh(η_{ j })=λη_{ j }^{ α }, where η_{ j } was some measure of importance calculated from additional data, and where α≥0 controlled the range of the penalty multipliers. The larger the α the more difference between favored covariates and unfavored ones. In [16], the optimal combination of λ and α was found using a 2dimensional 10fold cross validation (CV). This procedure could be time consuming, and the two parameters are influenced by each other and potentially unstable. Compared to our approach they did not consider qvalues as an input.
where \(q_{j}^{\ast } = 2q_{j}  1\), the qvalues linearly transformed to the interval from 1 to 1, and K is the normalization constant. We used the R package glmnet [27], which has penalty weights as argument and performs the calculations as explained in Additional file 2: Additional meterials.
Functional enrichment analysis
To analyze our results from the lasso, we used the Core Expression Analysis function in the Ingenuity Pathway Analysis software (IPA, QIAGEN Redwood City, www.qiagen.com/ingenuity) to identify overrepresented pathways, upstream regulators or diseases in our dataset. The available molecules and relationships in the IPA Knowledge Base for mammal (humans, mouse or rat) were considered, filtering to relationships experimentally observed, directly or indirectly. Pvalues were calculated using the standard righttailed Fisher exact test within IPA as part of the analysis.
Results
Results from the multivariate regression while integrating the associations between DNAmethylation and global gene expression
We quantify the relationship between global bone gene expression and global bone DNAmethylation by looking at two different scenarios. First we considered the association between each DNAmethylation and all transcripts. Next, we considered the association between each DNA methylation and only their cis transcripts, as described in the section “DNAmethylation and cis transcripts”. For the first scenario, the distribution of raw pvalues from the “global test” is shown in Additional file 4: Figure S7. There are 3830 DNAmethylations significantly associated with global transcripts at FDR =1%. Using these multiple test corrected pvalues, the qvalues, we fitted the adaptive groupregularized ridge regression, to get an initial impression of the importance of including the external data. To find a shortlist of the most important variables, we then fitted our version of the weighted lasso with data integration. We validated the methods using predictive mean square error (pMSE), by leave one out CV, which in turn leaves one sample out when training the models, and then runs the fitted model on each test sample left out.
Adaptive groupregularized ridge regression
The adaptive groupregularized ridge regression allows for one penalty per group of covariates. The DNAmethylations were therefore divided into 100 groups based on the qvalues, representing their strength of association with the gene transcripts. Using these groups, we found the resulting group penalty multipliers (see section “Adaptive groupregularized ridge regression”), and the updated \(\hat {\boldsymbol {\beta }}\). The method was then repeated, and converged after three iterations, where the current \(\hat {\boldsymbol {\beta }}\) did not improve in terms of cross validated sum of squares.
After validating the performance, we saw an improvement in pMSE from 2.070 (classical ridge regression) to 1.766 (adaptive groupregularized ridge regression), which was a 14.7% improvement. This means that the strength of association between each DNAmethylation and gene transcripts gave important information.
The extended weighted lasso with data integration
Next, we fitted the extended weighted lasso with data integration and obtained a shortlist of the most important DNAmethylations, which both explained BMD and were correlated to gene expression.
We calculated the optimal λ from the classical lasso, using 10fold cross validation. Next the penalty multipliers l_{ j }, described in Eq. (3), were calculated for a grid of α values from 0 to 50 in steps of 5. For each α the 10fold cross validated sum of squares error was determined. The optimal α, which gave the lowest cross validated error, became 10. The resulting number of variables selected where 22. In the validation step, our version of the weighted lasso gave then a pMSR =2.184, which is an improvement compared to the classical lasso which gave a pMSR =2.630. This corresponded to a 17% improvement.
Consensus in results from ridge regression and lasso
The 22 DNAmethylation sites and the calculated effect on BMD
In our weighted lasso with data integration, which performed variable selection, 22 DNAmethylations were selected. The 22 selected DNAmethylations corresponded to 22 different genes. In Additional file 5: Table S1 we see those selected DNAmethylations, the corresponding regression coefficient, and information about the localization of the CpGs. The degree of variation in BMD explained by these 22 DNAmethylations, was calculated by an ordinary least squares regression, using BMD as response and all the 22 DNAmethylations as covariates, resulting in a R^{2}=0.726. Such a large number of DNAmethylations can possibly explain a large proportion of BMD just by chance, so we compared it to the R^{2} for 10000 randomly selected groups of DNAmethylations with group size 22. On average, the random groups had R^{2}=0.438, clearly lower than the 22 DNAmethylations selected by the lasso regression. None of the random groups had R^{2} as large as 0.726.
Additional file 6: Table S2 shows that these DNAmethylations are involved in different biological functions. In this list are transcription factors (ZNF529, NFIA, TFEC, SPI1, MKL1), which have general and probably unique roles in most cells. They represent a link to other genes on the list involved in metabolism (Aldehyde dehydrogenase) that are already strongly implicated as highly associated with BMD [17]. In addition, Destrin, Protocadherin 9 and Dysferlin are known to be important for osteoblasts and/or osteocytes, which are bone anabolic or stress sensing cells, respectively [28–31]. Maybe most notable, 7 genes are specially linked to osteoblast functions and BMD, or generate a bone phenotype when aberrantly expressed (ADAMTS2, COL11A2, TFEC, DOCK5, PTPN11, ANXA2, DSTN). The results from the Ingenuity Pathway Analysis in Additional file 7: Table S3, show that several general functions and pathways might be affected by DNAmethylation in these 22 genes, which are discussed in the last part of the next section.
DNAmethylation and cis transcripts
As mentioned earlier, we also looked at the association between each DNAmethylation site and their cis gene expression. Additional file 8: Figure S8 shows the resulting pvalues from the “global test” when testing each methylation against only their cis related transcripts. There were 981 methylations with significant association at a FDR=1%. Using the adjusted pvalues from these type of “global tests”, we saw little or no improvement in the predictive performance. For the adaptive groupregularized ridge regression we got a pMSE =2.04, which was a small improvement compared to the classical Ridge with pMSE=2.07. When fitting our weighted lasso with data integration, the optimal α became 0 such that all l_{ j }=1, which means that the external data did not improve the prediction.
Discussion
Summary of the statistical approach
The present paper studies the association between bone mineral density (BMD) and global DNAmethylation using penalized regression. Since DNAmethylation may affect gene expression, transcripts for the same individuals were integrated into the analysis by use of individual penalties for each DNAmethylation. The DNAmethylations which were strongly associated to global gene expression, got a lower penalty than the DNAmethylations with weaker associations to gene expression. The degree to which these associations lead to differentiated penalties, was determined by the data. The predicted mean square error was thereafter considerably improved.
Discussing the statistical significance of the 22 CpGs found
In our list of 22 selected CpGs, the most familiar pathway belonging to bone metabolism (e.g. Wnt signaling and TGF β signaling) did not come up, and this may be due to the variable selection procedure in lasso. If covariates are strongly correlated, e.g. variables existing in networks, lasso tends to select only one out of several highly correlated variables [14], and the variable selection can therefore result in different short lists [32–34]. Our list of 22 DNAmethylations is the most important one, but there may also be other lists with bordering predictive value.
We investigated the possible difference between cis associations versus global association, and only saw improvements in the predictions after integrating information from global associations. When cis related genes were considered, no (or a small) improvement in the predictions was detected. The lack of improved predictive performance in the cis analysis was somewhat surprising, and may be related to the complex networks behind the machinery of gene expression. Thus, genes further down in the pathways could be more correlated, and therefore the transeffects will be more visible when using penalty weights. This indicates that strong associations between DNAmethylations and genes at other positions in the genome, could be of importance.
Robustness of hyper parameters
Our results were robust against the choice of the hyper parameters in the models. For our weighted lasso with data integration, the inflection point in the sigmoid function is a hyper parameter and was chosen based on the biological interpretation, and put equal to 0.1. We also tried other choices, still having the same interpretation as a significance check point, and the results and pMSE were not markedly changed. In the adaptive groupregularized ridge regression different choices of group sizes were applied, but made no notable changes to the predicted mean square error. Additionally, if we randomly permuted our calculated penalty weights, the integrated methods performed similar to the classical regression methods, in other words the external randomized information is not improving the prediction and therefore discarded.
Discussing our and similar methods
A hot topic of discussion is whether or not to standardize the covariates in penalized regression [35]. If we standardize, the idea of one common penalty is more appropriate, while on the other hand rescaling the data may remove some of the (differential) signal and may lead to instabilities of variables for which the sample variances are small. Standardization is equivalent to introducing a penalty multiplier that is proportional to the variance in the unstandardized setting [35]. So if one chooses to standardize, the type of standardization (the weights) will influence the results, because some covariates may be relatively more favored than others. If the original covariates have similar range and variance, it could be best not to standardize. DNAmethylation levels are relative measures, where all values lie between 0 (no DNAmethylation) and 1 (all methylated). These DNAmethylation levels are therefore already on the same scale and thus comparable, so we have chosen not to standardize.
There are several methods allowing for individual and groupwise penalties. The weights used in adaptive lasso are simply \(l_{j} = 1/\hat {\beta }_{j}^{L}\), where \(\hat {\beta }_{j}^{L}\) is the estimate from the classical lasso, and λ is again selected by cross validation [22]. Here, the covariates, which lasso finds important in the initial run, are given a lower penalty in the second step. Tai and Pan [24], used groups of covariates, and a λ_{ g } for each group g is found by cross validation, for g=1,…,G. This means that a Gdimensional cross validation is needed. For large G this is computationally too demanding to solve. In that case, they suggested to use λ_{ g }=λl_{ g }, and let l_{ g } be the group mean of the original, not shrunken coefficients, similar to adaptive lasso. To our knowledge, [16] was the first to introduce weights based on external data. If the external data set possesses important information, the data integrated weights will strengthen the prediction, regardless of the results from the classical lasso. In [23], they build upon this and use pvalues corrected for multiple testing as a basis. But in that paper they focus on dividing the covariates into two groups, where the first group should not be subject to selection and are given weights small enough to ensure their inclusion in the model. An alternative to the mentioned group penalty methods is the group lasso [36], where instead the external data is used to group the covariates, and a ridge penalty is forced on the groups.
Biological relevance of the 22 selected CpG sites in relation to bone
The human genome DNA methylation pattern is changing throughout life, from conception to old age [37, 38]. These changes preserve our epigenetic heritage and are important for e.g. regulating tissue differentiation [39]. The present material offers a unique opportunity to study the correlation between DNA methylation and gene expression associated with a common phenotype with strong genetic disposition  BMD. In addition, we extract differentially methylated CpGs between healthy compared to osteoporotic postmenopausal women by doing a robust genomewide analysis. To document specific functions related to the 22 bone associated methylations discovered in this study, will require experiments performed in bone cell cultures and/or in gene modified animals. We present below their functional assignment in bone as it is known from the literature.
Using Ingenuity Pathway Analysis software, we identified the top canonical pathways related to our list of CpG sites, shown in Additional file 7: Table S3. The most familiar pathways in relation to bone metabolism (like the Wnt or TGFB signaling pathways) are not among the significant results in IPA. However, osteoimmunology is now an established field and bone metabolism has been shown to be affected, e.g. via CD28 receptor signaling on Tcells, which in turn affects Wnt signaling in bone cells [40]. Thus, identification of “CD28 Signaling in T Helper Cells” as the top canonical pathway may indicate that this pathway is more important for bone metabolism than previously understood. The top ranked upstream regulator ERG (ETS Transcription Factor) may act on BMD via regulation of mesenchymal cell differentiation in cooperation with TGF β [41]. It is also worth noting that we have identified miR165p, second ranked among upstream regulators, to be among the mature bone miRNAs most significantly correlated to BMD (unpublished). Cancer appeared as top ranked among related diseases and disorders. Uncontrolled cell growth may involve a vast variety of epigenetic aberrations resulting in e.g. alterations of intracellular signaling related to a number of different diseases, probably also affecting bone metabolism. Hematological and Immunological Disease ranked second and third, respectively, reflecting the importance of blood/immune cells in bone metabolism as mentioned above.
Conclusions
Overall, in our version of the weighted lasso with data integration and the adaptive groupregularized ridge regression, the key advantages are that both methods allow for ordered penalty terms and thus a tentative ranking of the covariates or groups of covariates. Both methods let the data decide upon the range of the weights and thus the relevance of the external data, by either an additional parameter α or by iterations. In addition, the penalty parameter is defined by λ_{ j }=λl_{ j } where λ is fixed to be the optimal value from the classical penalized regression without integration, which makes the computations faster and more stable. Improved predictive performance was found for both methods after integrating gene expression data. We also conclude that the presented selection method, the weighted lasso with data integration, is successful in detecting DNAmethylations that are related to BMD, verifying genes that are known to be bone related, and moreover, detecting novel transcripts of potential importance for bone metabolism and osteoporosis. The overall impact from this paper is how adaptation and improved integrative methodology, enable us to delineate the association between DNAmethylation and gene expression in the analysis of BMD.
Declarations
Acknowledgements
The authors gratefully acknowledge the support and advice by Mark van de Wiel at Vrije University in Amsterdam.
Funding
This work was funded by the University of Oslo.
Availability of data and materials
The BMD data and the gene transcript data have been published in the European Bioinformatics Institute (EMBLEBI) ArrayExpress repository, ID: EMEXP1618. The DNAmethylation data analyzed during the current study is available from the corresponding author on reasonable request.
Authors’ contributions
SR and KMG collected and prepared the data. TGL, ØB and IKG designed the study. TGL did statistical analysis under the supervision of ØB and IKG. TGL draft the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Prior to the study, validation and recommendation were obtained by the Norwegian Local Research Ethics Committee (REK 2010/2539). All sampling and procedures followed the law and all subjects gave their informed written consent for participation.
Consent for publication
All subjects gave their informed written consent for publication of results.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 World Health Organization. Prevention and management of osteoporosis (WHO Technical Report Series 921).Geneva: World Health Organization; 2003. http://apps.who.int/iris/bitstream/10665/42841/1/WHO_TRS_921.pdf.Google Scholar
 O’Neill T, Felsenberg D, Varlow J, Cooper C, Kanis J, Silman A. The prevalence of vertebral deformity in European men and women: the European Vertebral Osteoporosis Study. J Bone Mineral Res. 1996; 11(7):1010–8.View ArticleGoogle Scholar
 DelgadoCalle J, Riancho JA. The role of DNA methylation in common skeletal disorders. Biology. 2012; 1(3):698–713.View ArticlePubMedPubMed CentralGoogle Scholar
 GarcíaIbarbia C, DelgadoCalle J, Casafont I, Velasco J, Arozamena J, PérezNúñez MI, Alonso MA, Berciano MT, Ortiz F, PérezCastrillón JL, et al.Contribution of genetic and epigenetic mechanisms to Wnt pathway activity in prevalent skeletal disorders. Gene. 2013; 532(2):165–72.View ArticlePubMedGoogle Scholar
 Schones DE, Zhao K. Genomewide approaches to studying chromatin modifications. Nat Rev Genet. 2008; 9(3):179–91.View ArticlePubMedGoogle Scholar
 Bird A. DNA methylation patterns and epigenetic memory. Genes Dev. 2002; 16(1):6–21.View ArticlePubMedGoogle Scholar
 Wagner JR, Busche S, Ge B, Kwan T, Pastinen T, Blanchette M. The relationship between DNA methylation, genetic and expression interindividual variation in untransformed human fibroblasts. Genome Biol. 2014; 15(2):37.View ArticleGoogle Scholar
 Maunakea AK, Nagarajan RP, Bilenky M, Ballinger TJ, D’Souza C, Fouse SD, et al.Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature. 2010; 466(7303):253–7.View ArticlePubMedPubMed CentralGoogle Scholar
 Reppe S, Lien TG, Hsu YH, Gautvik VT, Olstad OK, Yu R, Bakke HG, Lyle R, Kringen MK, Glad IK, et al.Distinct DNA methylation profiles in bone and blood of osteoporotic and healthy postmenopausal women. Epigenetics. 2017; 12(8):674–87.View ArticlePubMedGoogle Scholar
 Reppe S, Noer A, Grimholt RM, Halldórsson BV, MedinaGomez C, Gautvik VT, Olstad OK, Berg JP, Datta H, Estrada K, et al.Methylation of bone SOST, its mRNA, and serum sclerostin levels correlate strongly with fracture risk in postmenopausal women. J Bone Mineral Res. 2015; 30(2):249–56.View ArticleGoogle Scholar
 del Real A, PérezCampo FM, Fernández AF, Sañudo C, Ibarbia CG, PérezNúñez MI, Criekinge WV, Braspenning M, Alonso MA, Fraga MF, et al.Differential analysis of genomewide methylation and gene expression in mesenchymal stem cells of patients with fractures and osteoarthritis. Epigenetics. 2017; 12(2):113–22.View ArticlePubMedGoogle Scholar
 Hoerl AE, Kennard RW. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics. 1970; 12(1):55–67.View ArticleGoogle Scholar
 van de Wiel MA, Lien TG, Verlaat W, van Wieringen WN, Wilting SM. Better prediction by use of codata: adaptive groupregularized ridge regression. Stat Med. 2016; 35(3):368–81.View ArticlePubMedGoogle Scholar
 Zou H, Hastie T. Regularization and variable selection via the elastic net. J Am Stat Assoc. 2005; 67(2):301–20.Google Scholar
 Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B. 1996; 58(1):267–88.Google Scholar
 Bergersen LC, Glad IK, Lyng H. Weighted lasso with data integration. Stat Appl Genet Mol Biol. 2011; 10(1):1–29.View ArticleGoogle Scholar
 Reppe S, Refvem H, Gautvik VT, Olstad OK, Høvring PI, Reinholt FP, Holden M, Frigessi A, Jemtland R, Gautvik KM. Eight genes are highly associated with BMD variation in postmenopausal Caucasian women. Bone. 2010; 46(3):604–12.View ArticlePubMedGoogle Scholar
 Aryee MJ, Jaffe AE, CorradaBravo H, LaddAcosta C, Feinberg AP, Hansen KD, Irizarry RA. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014; 30(10):1363–9.View ArticlePubMedPubMed CentralGoogle Scholar
 Teschendorff AE, Marabita F, Lechner M, Bartlett T, Tegner J, GomezCabrero D, Beck S. A betamixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics. 2013; 29(2):189–96.View ArticlePubMedGoogle Scholar
 Goeman JJ, Van De Geer SA, Van Houwelingen HC. Testing against a high dimensional alternative. J R Stat Soc Ser B. 2006; 68(3):477–93.View ArticleGoogle Scholar
 Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J R Stat Soc Ser B. 1995; 57(1):289–300.Google Scholar
 Zou H. The Adaptive Lasso and Its Oracle Properties. J Am Stat Assoc. 2006; 101(476):1418–29.View ArticleGoogle Scholar
 Garcia PT, Müller S, Carroll RJ, Dunn TN, Thomas AP, Adams AH, Pillai SD, Walzem RL. Structured variable selection with qvalues. Biostatistics. 2013; 14(4):695–707.View ArticlePubMedPubMed CentralGoogle Scholar
 Tai F, Pan W. Incorporating prior knowledge of predictors into penalized classifiers with multiple penalty terms. Bioinformatics. 2007; 23(14):1775–82.View ArticlePubMedGoogle Scholar
 Liu C, Jiang J, Gu J, Yu Z, Wang T, Lu H. Highdimensional omics data analysis using a variable screening protocol with prior knowledge integration (SKI). BMC Syst Biol. 2016; 10(4):457.Google Scholar
 Robertson T, Wright F, Dykstra R. Order Restricted Statistical Inference. 1988.Google Scholar
 Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010; 33(1):1–22.View ArticlePubMedPubMed CentralGoogle Scholar
 Shuang F, Sun Y, Yang HH, Shao YC, Li H, Hu W, Zhong J, Zou HX. Destrin deletion enhances the bone loss in hindlimb suspended mice. Eur J Appl Physiol. 2013; 113(2):403–10.View ArticlePubMedGoogle Scholar
 Bonewald LF. The amazing osteocyte. J Bone Miner Res. 2011; 26(2):229–38.View ArticlePubMedGoogle Scholar
 Choi HD, Noh WC, Park JW, Lee Jm, Suh JY. Analysis of gene expression during mineralization of cultured human periodontal ligament cells. J Periodontal Implant Sci. 2011; 41(1):30–43.View ArticlePubMedPubMed CentralGoogle Scholar
 Carinci F, Piattelli A, Guida L, Perrotti V, Laino G, Oliva A, Annunziata M, Palmieri A, Pezzetti F. Effects of Emdogain on osteoblast gene expression. Oral Dis. 2006; 12(3):329–42.View ArticlePubMedGoogle Scholar
 EinDor L, Kela I, Getz G, Givol D, Domany E. Outcome signature genes in breast cancer: is there a unique set?Bioinformatics. 2005; 21(2):171–8.View ArticlePubMedGoogle Scholar
 Michiels S, Koscielny S, Hill C. Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet. 2005; 365(9458):488–92.View ArticlePubMedGoogle Scholar
 Meinshausen N, Bühlmann P. Stability selection. J R Stat Soc Ser B. 2010; 72(4):417–73.View ArticleGoogle Scholar
 Zwiener I, Frisch B, Binder H. Transforming RNASeq data to improve the performance of prognostic gene signatures. PLoS ONE. 2014;9(1).Google Scholar
 Yuan M, Lin Y. Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B. 2006; 68(1):49–67.View ArticleGoogle Scholar
 Canovas S, Ross PJ, Kelsey G, Coy P. DNA Methylation in Embryo Development: Epigenetic Impact of ART (Assisted Reproductive Technologies). BioEssays. 2017;39(11).Google Scholar
 Jones MJ, Goodman SJ, Kobor MS. DNA methylation and healthy human aging. Aging Cell. 2015; 14(6):924–32.View ArticlePubMedPubMed CentralGoogle Scholar
 Briones V, Muegge K. The ghosts in the machine: DNA methylation and the mystery of differentiation. Biochim Biophys Acta (BBA)Gene Regul Mech. 2012; 1819(7):757–62.View ArticleGoogle Scholar
 Weitzmann MN. Tcells and Bcells in osteoporosis. Curr Opin Endocrinol Diabetes Obes. 2014; 21(6):461.View ArticlePubMedPubMed CentralGoogle Scholar
 Cox MK, Appelboom BL, Ban GI, Serra R. Erg cooperates with TGF β to control mesenchymal differentiation. Exp Cell Res. 2014; 328(2):410–18.View ArticlePubMedPubMed CentralGoogle Scholar
 Illumina: Infinium HumanMethylation450 BeadChip. https://cancergenome.nih.gov/abouttcga/aboutdata/platformdesign/illuminamethylation450. Accessed 25 Oct 2017.