Untargeted metabolomic approach to study the serum metabolites in women with polycystic ovary syndrome

Polycystic ovary syndrome (PCOS) is both a common endocrine syndrome and a metabolic disorder that impairs the reproductive system and whole-body metabolism of patients worldwide. In this study, we aimed to investigate the differences in serum metabolic profiles between patients with PCOS and healthy controls. Thirty-one PCOS patients and 31 matched healthy female controls were recruited; their clinical characteristics were recorded and laboratory biochemical data were measured. We then used a metabolomics approach based on UPLC-HRMS to study the serum metabolic differences between PCOS patients and controls. The metabolomics analysis showed 68 downregulated and 78 upregulated metabolites in the serum of PCOS patients compared with controls. These metabolites mainly belong to triacylglycerols, glycerophosphocholines, acylcarnitines, diacylglycerols, peptides, amino acids, glycerophosphoethanolamines and fatty acids. Pathway analysis showed that these metabolites were enriched in pathways including glycerophospholipid metabolism, fatty acid degradation, fatty acid biosynthesis and ether lipid metabolism. Diagnostic value assessed by ROC analysis showed that the changed metabolites Leu–Ala/Ile–Ala, 3-(4-Hydroxyphenyl) propionic acid, Ile–Val/Leu–Val, Gly–Val/Val–Gly, aspartic acid, DG(34:2)_DG(16:0/18:2), DG(34:1)_DG(16:0/18:1), Phe–Trp, DG(36:1)_DG(18:0/18:1) and Leu–Leu/Leu–Ile had high AUC values, indicating a significant role in PCOS. The present study characterized the differences in serum metabolites and related pathway profiles in PCOS patients; these findings may provide potential metabolic markers for the prognosis and diagnosis of this disease.


Background
Worldwide, approximately 15-20% of women of childbearing age are affected by polycystic ovary syndrome (PCOS) according to the Rotterdam criteria [1]. PCOS is both one of the most common endocrine syndromes and a metabolic disorder, mainly characterized by hyperandrogenism (HA) and insulin resistance (IR). The main clinical manifestations of PCOS are irregular menstrual cycles, oligo-ovulation, polycystic ovarian morphology, IR-induced obesity, and HA-induced hirsutism and acne [2]. However, the diagnosis of PCOS remains controversial and the criteria continue to be updated [3][4][5]. Beyond the impairment of ovarian function and overall body metabolism, the resulting anovulatory infertility and recurrent pregnancy loss also cause tremendous harm to PCOS patients. In addition, owing to ovarian and metabolic dysfunction, the incidence of adverse outcomes such as gynecological cancer, hypertension, atherosclerosis, type 2 diabetes mellitus (T2DM), and cardiovascular disease (CVD) also seems to be higher in women with PCOS than in the general population [6][7][8]. In light of these risks, there is a strong need for reliable biochemical or molecular markers that would enable accurate diagnosis and effective therapy of PCOS.
Yet knowledge of the mechanisms underlying PCOS pathophysiology is still insufficient, which restricts the development of effective therapies to ameliorate the symptoms of PCOS or its related metabolic complications [9]. Moreover, half of all women with PCOS are thought to remain undiagnosed. Genomic, proteomic, and metabolomic approaches have been introduced to study the pathogenesis of various diseases. Metabolomics involves the comprehensive characterization of metabolites in biological systems and is widely applied to improve disease diagnosis, understand potential mechanisms, identify novel drug targets, customize drug treatments and monitor therapeutic outcomes [10]. The untargeted metabolomic approach, known as metabolic fingerprinting, focuses on identifying and quantifying as many low-molecular-weight compounds as possible in the tested samples. This approach is commonly applied to uncover metabolic profiles and metabolic markers and to reveal new insights into the mechanisms underlying the pathogenesis of human diseases, including PCOS [11].
In this study, using a metabolomics approach based on ultra-performance liquid chromatography-high resolution mass spectrometry (UPLC-HRMS), we aimed to characterize the metabolic fingerprints of PCOS patients, in the hope of identifying potential metabolic markers for the prognosis and diagnosis of this disease.

Study subjects
All PCOS patients and healthy controls were recruited from the Zhejiang Provincial Hospital of Chinese Medicine (Hangzhou, China). This study was approved by the Ethics Committee of Zhejiang Provincial Hospital of Chinese Medicine. Signed informed consent was obtained from all participants before inclusion in this study.
According to the Rotterdam criteria (2003), PCOS can be diagnosed if two of the following three criteria are present after excluding congenital adrenal hyperplasia, Cushing's syndrome, androgen-secreting tumors, or other related disorders: (1) oligo- and/or anovulation; (2) clinical and/or biochemical signs of HA (clinical manifestations of HA include acne, hirsutism, and androgenic alopecia); (3) polycystic ovaries on ultrasound examination, defined as the presence of 12 or more follicles in each ovary measuring 2-9 mm in diameter and/or ovarian volume > 10 cm³.
The inclusion criteria for PCOS cases in this study were: diagnosis of PCOS according to the Rotterdam criteria, 2003 [4]; females aged 18-40 years; and at least 2 years of menstrual history. The exclusion criteria were: any androgenic drug or sex steroid therapy in the 3 months before the study; current pregnancy, or delivery or miscarriage within the preceding 3 months; congenital adrenal hyperplasia, androgen-secreting tumors, or other diseases with HA; and thyroid dysfunction, hyperprolactinemia, cardiovascular diseases, diabetes or any chronic disease. The control group consisted of healthy female volunteers aged 18-40 years with regular menstrual cycles and normal androgen levels, without PCOS or IR, and with no evident disease detected during the study. According to these inclusion and exclusion criteria, a total of 31 PCOS patients and 31 healthy participants were included from December 2018 to April 2019.
The clinical characteristics of the enrolled participants were recorded at the time of recruitment. A blood sample was collected from each participant after fasting for 8 h. The serum samples were stored at −80 ℃ for subsequent assay.

UPLC-HRMS instrumentation and measurement conditions
Untargeted metabolomics analysis was conducted using three different analytical methods (M1-3) on an Ultimate 3000 ultra-high performance liquid chromatography system coupled with a Q Exactive™ quadrupole-Orbitrap high-resolution mass spectrometer (UPLC-HRMS) (Thermo Scientific, USA).

UPLC system
Methods 1 and 2 (M1, M2) were used to analyze the polar metabolome extracts on the UPLC-HRMS system with positive and negative ionization detection, respectively. For M1, metabolites were separated on an Acquity™ HSS C18 column (Waters Co., USA, 2.1 × 100 mm) and eluted with 0.1% formate/water (A) and acetonitrile (B) in a linear gradient from 2% to 98% organic mobile phase in 10 min. A second set of mobile phases, consisting of water and acetonitrile/methanol, both containing ammonium bicarbonate buffer salt, was used to elute metabolites separated on an Acquity™ BEH C18 column (Waters Co., USA, 1.7 μm, 2.1 × 100 mm) with the following gradient: from 0 to 10 min, the organic phase was ramped from 2% to 100%; from 10 to 15 min, the column was washed and equilibrated. Untargeted lipidomic analysis was performed with Method 3 (M3), using the same chromatographic separation conditions for both positive and negative ionization detection modes. The column was an Accucore C30 core-shell column, and the mobile phases were 60% acetonitrile in water (A) and 10% acetonitrile in isopropanol (B), both containing 10 mM ammonium formate and 0.1% formate. The separation gradient was optimized as follows: initial 10% B, ramping to 50% in 5 min and further increasing to 100% in 23 min, with the remaining 7 min used for column washing and equilibration. For Methods 1-3, the flow rate was 0.4 mL/min, the injection volume was 5 μL, and the column temperature was 50 ℃.
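As an illustration of how a gradient program like the one for M3 translates into mobile phase composition over time, the following minimal Python sketch linearly interpolates %B between breakpoints. The breakpoint table is an assumed reading of the text (the 23 min point is taken as elapsed run time and the final 7 min is modeled as a simple hold at 100% B, ignoring re-equilibration); the names `M3_GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# Assumed breakpoints (time in min, %B) for the M3 gradient.
# Interpreting "100% in 23 min" as elapsed run time is an assumption,
# and the 7 min wash/equilibration is simplified to a hold at 100% B.
M3_GRADIENT = [(0.0, 10.0), (5.0, 50.0), (23.0, 100.0), (30.0, 100.0)]


def percent_b(t_min, program=M3_GRADIENT):
    """Linearly interpolate %B at time t_min within a gradient program."""
    times = [t for t, _ in program]
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    # Index of the segment whose start time is <= t_min.
    i = bisect_right(times, t_min) - 1
    (t0, b0), (t1, b1) = program[i], program[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 2.5, 5, 14, 23, 30):
        print(f"{t:>5} min -> {percent_b(t):.1f}% B")
```

The same function could be reused for the 2% to 98% linear gradient of M1 by passing a different breakpoint list.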

Mass spectrometer system
For Methods 1 and 2, the quadrupole-Orbitrap mass spectrometer was operated with a heated electrospray ionization source under identical ionization parameters, except for the ionization voltage: sheath gas 45 arb, aux gas 10 arb, heater temperature 355 ℃, capillary temperature 320 ℃ and S-Lens RF level 55%. The metabolome extracts were profiled in full scan mode at 70,000 FWHM resolution with an AGC target of 1E6 and a maximum injection time of 200 ms. The scan range was 70-1000 m/z. QC samples were repeatedly injected to acquire Top 10 data-dependent MS2 spectra (full scan-ddMS2) for comprehensive metabolite and lipid structural annotation. A resolution of 17,500 FWHM was used for MS/MS data acquisition. Apex trigger, dynamic exclusion, and isotope exclusion were turned on, and the precursor isolation window was set at 1.0 Da. Stepped normalized collision energy was employed for collision-induced dissociation of metabolites, using ultra-pure nitrogen as the fragmentation gas. All data were acquired in centroid format. For Method 3, the ionized lipid molecules were detected using the same ionization parameters as described above. Lipid extracts were profiled over 300-2000 m/z with the same parameters as those used for the metabolome. Lipids were structurally identified by acquiring data-dependent MS2 spectra; the key settings included 70,000 FWHM full scan resolution, 17,500 FWHM MS/MS resolution, loop count 10, AGC target 3e6, maximum injection times of 200 ms and 80 ms for full scan and MS/MS, respectively, and dynamic exclusion of 8 s. After optimization, stepped normalized collision energies of 25% + 40% and 35% were employed for positive and negative mode, respectively.

Metabolomics data analysis
The full scan and data-dependent MS2 metabolic profile data were further processed with Compound Discoverer software for comprehensive component extraction. The polar metabolites were structurally annotated by searching the acquired MS2 spectra against a local proprietary iPhenome™ SMOL high-resolution MS/MS spectrum library created using authentic standards, the NIST 17 Tandem MS/MS library (National Institute of Standards and Technology), a local version of MoNA (MassBank of North America), and the mzCloud library (Thermo Scientific, USA). In addition, the exact m/z of MS1 spectra was searched against local KEGG and HMDB metabolite chemical databases. For metabolite identification or structural annotation, a precursor mass accuracy within ± 5 ppm was a prerequisite; isotopic information, including at least one isotope within 10 ppm and a relative isotopic abundance pattern fit score of 70%, was used to confirm the chemical formula in addition to the exact mass. Furthermore, retention time information and high-resolution MS/MS spectral similarity were employed to strictly confirm the structural annotation of metabolites. The area under curve (AUC) values were extracted as quantitative information for the metabolites with XCalibur Quan Browser, and all peak area data for the annotated metabolites were exported into Excel (Microsoft, USA) for trimming and organization before statistical analysis. Untargeted lipidomics data were processed with LipidSearch software, including peak picking and lipid identification. The acquired MS2 spectra were searched against in silico predicted spectra of diverse phospholipids, neutral glycerolipids, sphingolipids, neutral glycosphingolipids, glycosphingolipids, steroids, fatty esters, etc. The mass accuracy tolerances for precursor and MS/MS product ion searching were 5 ppm and 5 mDa, respectively. The MS/MS similarity score threshold was set at 5. The potential ionization adducts included hydrogen, sodium and ammonium for positive mode, and hydrogen loss, formate and acetate adducts for negative mode. Each lipid identification was manually checked one by one to eliminate false positives, chiefly based on peak shape, adduct ion behavior, fragmentation pattern, and chromatographic behavior.
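The mass accuracy criteria above (5 ppm for precursors, 5 mDa for product ions) amount to simple arithmetic on measured and theoretical m/z values. The sketch below shows that arithmetic; the function names and the example m/z values are hypothetical and are not taken from the study data or from any of the software packages mentioned.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ppm(measured_mz, theoretical_mz, tol_ppm=5.0):
    """True if the precursor mass error is within +/- tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


def within_mda(measured_mz, theoretical_mz, tol_mda=5.0):
    """True if the absolute mass error is within tol_mda millidaltons."""
    return abs(measured_mz - theoretical_mz) * 1000.0 <= tol_mda


if __name__ == "__main__":
    # Illustrative values only (not real metabolite masses).
    theoretical = 200.0000
    for measured in (200.0008, 200.0012):
        err = ppm_error(measured, theoretical)
        print(f"{measured:.4f}: {err:.1f} ppm, accepted={within_ppm(measured, theoretical)}")
```

Note that a fixed ppm window becomes wider in absolute terms at higher m/z, whereas a mDa window is constant across the mass range.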

Statistical analysis
All clinical data were analyzed using SPSS version 18.0. An unpaired, two-tailed Student's t-test was performed on clinical biochemical data, and the chi-square test was used to compare categorical variables. A p value < 0.05 was considered statistically significant. The metabolome and lipidome data derived from the different measurements were each normalized to the sample weight used before further processing. The resulting quantitative information from the foregoing methods was then merged, duplicates detected with multiple methods were excluded to guarantee the uniqueness of each metabolite and lipid, and the data were Log10 transformed for final statistical analysis. Principal component analysis was conducted with SIMCA-P software (Umetrics, Sweden); univariate analyses, including the independent-sample t-test with FDR adjustment of p values, as well as metabolic pathway analysis, were conducted on the MetaboAnalyst website.
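To make the FDR adjustment step concrete, the following sketch implements the Benjamini-Hochberg procedure, which is one common way to compute FDR-adjusted p values. The text does not state which FDR procedure was applied, so the choice of Benjamini-Hochberg here is an assumption, and the function name and example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # illustrative p values only
    adj = benjamini_hochberg(raw)
    significant = [q < 0.05 for q in adj]
    print(adj, significant)
```

In this kind of workflow, the adjusted values would then be compared against the 0.05 threshold to select differential metabolites.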

Clinical characteristics and biochemical data of the study subjects
The clinical characteristics and biochemical data of the study subjects were collected and analyzed (Table 1). The study subjects included 31 healthy controls and 31 women with PCOS. There were no statistically significant differences in age or BMI between the two groups (p value > 0.05). Among the biochemical data, the levels of fasting glucose, LH, T, TG and LDL-c and the LH/FSH ratio were significantly higher in PCOS patients than in controls, whereas the levels of PRL and HDL-c were significantly lower in PCOS patients than in controls (p value < 0.05).

Multivariate statistical analysis
Principal component analysis (PCA) outlined the original distribution of metabolites in PCOS and control subjects. As shown in Fig. 1a, the PCA score plot suggested that there were no obvious outlier samples in either group. The PCOS and control groups were separated along the t[2] axis but not along the t[1] axis. Hence, an OPLS-DA model was applied for further analysis. As shown in Fig. 1b, PCOS samples could be clearly distinguished from the healthy control samples. The model had a satisfactory fit, with R² = 0.93 and Q² = 0.70, which indicated significant discrimination of the serum metabolomic signature between the control and PCOS groups. In Fig. 1c, permutation plots of the OPLS-DA model repeated 999 times verified the reliability of the model. The S-plot of the OPLS-DA model indicated the influence of metabolite expression level on metabolic phenotype classification (Fig. 1d).

Identification of significantly changed metabolites by UPLC-HRMS
Variables with an FDR-adjusted p value < 0.05 were considered significant in the OPLS-DA model. As a result, a total of 146 significantly changed metabolites were identified and selected as potential biomarkers of PCOS for subsequent analysis. The volcano plot showed that, compared with the control group, 68 of these metabolites were downregulated and 78 were upregulated (Fig. 2a, Table 2). A heatmap of these 146 significantly changed metabolites in the 31 PCOS samples and 31 control samples indicated that these metabolites clustered PCOS patients apart from healthy controls (Fig. 2b). After chemical structure classification of the 146 differential metabolites, Fig. 3 showed that the significantly changed metabolites mainly belong to the classes of triacylglycerol (36 metabolites), glycerophosphocholine (34 metabolites), acylcarnitine (15 metabolites), diacylglycerol (15 metabolites), peptide (10 metabolites), amino acid (8 metabolites

Metabolite enrichment and metabolic pathway analysis
Based on these identified metabolites, metabolic pathway analysis (MetPA) was performed (Fig. 4a, Table 3). In Fig. 4a, −log (p value) and pathway impact are the X and Y axes of the bubble diagram. These metabolites were significantly enriched in metabolic pathways including glycerophospholipid metabolism, sphingolipid metabolism, phenylalanine, tyrosine and tryptophan biosynthesis, arginine biosynthesis, histidine metabolism, and ether lipid metabolism. Furthermore, metabolite set enrichment analysis (MSEA) was performed (Fig. 4b, Table 4).
The results showed that the enriched metabolic pathways included purine metabolism, porphyrin and chlorophyll metabolism, FA degradation, taurine and hypotaurine metabolism, phenylalanine metabolism, phenylalanine, tyrosine and tryptophan biosynthesis, FA biosynthesis, etc.
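According to the caption of Fig. 4, the over-representation analysis relied on a hypergeometric test. As a rough sketch of what such a test computes, the function below returns the probability of observing at least k pathway members among n submitted metabolites, given a background of N metabolites of which K belong to the pathway. The function name and all numbers in the usage example are hypothetical; this is not MetaboAnalyst's implementation and it omits the topology (pathway impact) component.

```python
from math import comb


def ora_p_value(background, pathway_size, submitted, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    background   -- N, total metabolites in the reference set
    pathway_size -- K, metabolites in the pathway
    submitted    -- n, differential metabolites tested
    hits         -- k, differential metabolites found in the pathway
    """
    upper = min(pathway_size, submitted)
    total = comb(background, submitted)
    tail = sum(
        comb(pathway_size, i) * comb(background - pathway_size, submitted - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Illustrative numbers only: 10 background compounds, a pathway of 4,
    # 3 submitted compounds, all 3 of which fall in the pathway.
    print(ora_p_value(10, 4, 3, 3))
```

A small p value indicates that the pathway contains more of the submitted metabolites than would be expected by chance; in practice these p values would themselves be adjusted for multiple testing across pathways.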

ROC curves of significant metabolites in PCOS patients and controls
To further distinguish PCOS patients from controls, ROC curve analysis was also conducted on these changed metabolites. The top 10 metabolites with AUC values over 0.9 were presented in
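For readers unfamiliar with how the AUC of a single metabolite can be obtained, the sketch below computes it through its rank-based interpretation: the probability that a randomly chosen PCOS sample has a higher value than a randomly chosen control sample, with ties counted as one half. This is a minimal illustration with made-up values; the function name is hypothetical and it is not the procedure used by the software in this study.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties contribute 0.5. This is equivalent to the Mann-Whitney U
    statistic divided by the number of pairs.
    """
    wins = 0.0
    for c in case_values:
        for h in control_values:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


if __name__ == "__main__":
    # Illustrative intensities only (not study data).
    pcos = [3.0, 4.0, 5.0]
    controls = [1.0, 2.0, 3.0]
    auc = roc_auc(pcos, controls)
    # For a metabolite that is lower in cases, the AUC falls below 0.5;
    # max(auc, 1 - auc) reports its discriminative ability regardless of direction.
    print(auc, max(auc, 1.0 - auc))
```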

Discussion
PCOS is a common endocrine syndrome and metabolic disorder that seriously harms the reproductive system and overall body metabolism of patients [2]. In this study, we investigated the metabolic differences between PCOS patients and healthy controls. The metabolomics analysis showed 146 significantly changed metabolites in the serum of PCOS patients, of which 68 were downregulated and 78 were upregulated. These metabolites mainly belong to triacylglycerols, glycerophosphocholines, acylcarnitines, diacylglycerols, peptides, amino acids, glycerophosphoethanolamines, and FAs. Pathway analysis showed that these metabolites were enriched in pathways including glycerophospholipid metabolism, FA degradation, FA biosynthesis, and ether lipid metabolism. Diagnostic value assessment by ROC analysis showed that AUC values of Leu-Ala/Ile-Ala, 3-
Metabolomics enables the identification of both endogenous metabolites arising from the downstream output of the genome and exogenous metabolites arising from the upstream input of the environment, thereby allowing researchers to explore the nexus of gene-environment interactions and providing unique insights into the fundamental causes of disease [10,12]. To date, many metabolomic studies have revealed the metabolic profiles and changes in PCOS patients under various conditions. Zhang et al. recruited 286 subjects to reveal the metabolic profile of women with HA and IR in PCOS; the 59 differential metabolites identified were related to the biosynthesis of unsaturated FAs and the citrate cycle, and these metabolites were meaningful for reflecting the underlying mechanism of PCOS and for serving as biomarkers for complementary diagnosis of HA and IR in PCOS [13]. Another study enrolled 10 PCOS patients and 10 healthy people and identified six biomarkers, L-Carnitine, LPE (22:5), Sphinganine, LPC (18:2), DHEAS and Glycocholic acid, which belong to metabolic pathways including lipid metabolism, carnitine metabolism, androgen metabolism, and bile acid metabolism [14]. Zhao's metabolomics study suggested that PCOS patients and healthy controls could be distinguished using a combinational biomarker of free fatty acids (FFA) 18:1/FFA 18:0, FFA 20:3, dihydrotestosterone sulfate, glycated phenylalanine, and uridine, with an AUC of 0.839 [15]. These studies revealed the metabolomic changes in PCOS patients and offered new insights into disease processes, but the different study subjects and metabolomic techniques used impose important limitations on integrating the results of the studies conducted to date.
Fig. 3 Chemical structure classification of differential metabolites between PCOS and control group
In the present study, over half of the identified metabolites belong to triacylglycerol (36 metabolites), glycerophosphocholine (34 metabolites), and diacylglycerol (15 metabolites), and most of them were upregulated in the PCOS group. Triacylglycerol, also named triglyceride (TG), and diacylglycerol are main components of lipids. As PCOS is a metabolic disorder, IR and the obesity it induces are common in PCOS patients. Hence, lipid and lipoprotein metabolic abnormalities accompany PCOS progression [16]. Previous studies also demonstrated that PCOS-associated metabolites are mostly involved in lipid metabolism [14,15,17]. Overweight PCOS patients usually have lipid abnormalities, including a higher level of serum TG. This was also observed in our biochemical tests (Table 1), with an elevated TG level in PCOS patients compared with controls. A cross-sectional study showed that subjects with PCOS had a higher waist:hip ratio and higher T, TG, and VLDL-cholesterol concentrations (p < 0.05) [18]. The abnormally elevated TG level could be decreased by vitamin D supplementation for 8 weeks in women with PCOS [19]. A cross-sectional study of 156 age-matched women with or without PCOS showed that diacylglycerol and triacylglycerol were inversely associated with SHBG and positively associated with homeostasis assessment of insulin resistance, free androgen index, and waist circumference [20]. This provides evidence that specific alterations in lipid composition and function are involved in PCOS pathophysiology and affect its clinical manifestations.
In addition, fatty acids (FAs) were among the differential metabolites identified in PCOS patients in this study; the three FAs (Dihomo-alpha-linolenic acid, Myristoleic acid isomer 1, Myristoleic acid isomer 2) were all downregulated in the PCOS group. Dihomo-alpha-linolenic acid is a rare polyunsaturated fatty acid (PUFA) of the ω-3 series. ω-3 PUFA supplementation has a positive effect on ovarian function and potentiates cellular development and steroid biosynthesis in PCOS [21]. PUFAs can modulate the hormonal and lipid profiles of the body and lower TG and cholesterol levels, and patients with PCOS usually show abnormal levels of PUFA metabolites. A study of differences in the FA profiles of abdominal subcutaneous adipose tissue between pregnant women with and without PCOS found that total PUFA was lower in PCOS than in non-PCOS women (p < 0.004) [22]. An animal model study also showed that ω-3 PUFA had an effective role in improving the lipid and hormonal profile and in reducing blood glucose, body weight and histopathological damage in PCOS rats [23]. Given the positive role of FAs in normal lipid metabolism and ovarian function, the finding in this study that the significantly changed FAs were all downregulated in PCOS patients is consistent with previous reports.
Fig. 4 Pathway analysis of the differential metabolites between PCOS versus CON group. a Pathway analysis result of differential metabolites between PCOS versus control group using the over-representation method on the MetaboAnalyst website (p value < 0.05 of t-test after FDR adjustment). The hypergeometric test and relative betweenness centrality algorithm were used for pathway topology analysis; the human KEGG pathway library was used. b Metabolite set enrichment analysis of all metabolites with an HMDB identifier using the quantitative enrichment analysis method. Pathway-associated metabolite sets (KEGG) containing 84 metabolite sets based on normal human metabolic pathways were used for this MSEA
As mentioned above, PCOS-associated metabolites were mostly involved in lipid and lipoprotein metabolic abnormalities. In the present study, pathway analysis found that the differential metabolites were associated with various pathways, notably glycerophospholipid metabolism, sphingolipid metabolism, phenylalanine metabolism, ether lipid metabolism, purine metabolism, fatty acid degradation, and fatty acid biosynthesis. An untargeted metabolomics study of PCOS follicular fluid also found significant differences in the abundance of glycerolipids, glycerophospholipids, sphingolipids, and carboxylic acids compared with healthy women, and these metabolic dysfunctions contributed to a decline in the two-pronuclei (2PN) fertilization rate during in vitro fertilization (IVF) [24]. Another LC-MS-based metabolomics study showed that abnormalities of glycerophospholipid, glycerolipid, and FA metabolism were involved in the pathogenesis of PCOS and IR complications [25]. Amino acid metabolism is also a critical metabolic pathway of the body. In this study, in addition to the eight differential amino acids identified in PCOS, several related amino acid pathways were also identified, indicating the involvement of amino acid metabolism [26]. Fatty acid-related pathways, including fatty acid degradation and biosynthesis, were also found to be associated with the changed metabolites in PCOS in this study, which is consistent with the differential metabolites identified in PCOS patients compared with healthy controls.

Conclusion
In this study, metabolomics analysis of serum from PCOS patients identified 146 significantly changed metabolites. These differential metabolites mainly belong to triacylglycerols, glycerophosphocholines, acylcarnitines, diacylglycerols, peptides, amino acids, glycerophosphoethanolamines and FAs. Pathway analysis of these metabolites revealed disordered lipid metabolism in PCOS, including glycerophospholipid metabolism, fatty acid degradation/biosynthesis, and ether lipid metabolism. Leu-Ala/Ile-Ala, 3-(4-Hydroxyphenyl) propionic acid, Ile-Val/Leu-Val, and Gly-Val/Val-Gly were identified as potential biomarkers for the diagnosis of PCOS, with AUC values over 0.98, indicating a significant role of these metabolites in PCOS. Our findings suggest that untargeted metabolomics offers a promising approach to investigating the metabolic abnormalities in PCOS patients, which may be useful for mechanistic research on PCOS and holds promise for PCOS diagnosis. However, because of the limited sample size of the present study, our findings need to be confirmed by large-scale metabolomics studies.