- Research
- Open access
Diagnostic potential of salivary microbiota in persistent pulmonary nodules: identifying biomarkers and functional pathways using 16S rRNA sequencing and machine learning
Journal of Translational Medicine volume 22, Article number: 1079 (2024)
Abstract
Background
The aim of this study was to explore the microbial variations and biomarkers in the oral environment of patients with persistent pulmonary nodules (pPNs) and to reveal the potential biological functions of the salivary microbiota in pPNs.
Materials and methods
This study included a total of 483 participants (141 healthy controls and 342 patients with pPNs) enrolled between June 2022 and January 2024. Saliva samples were subjected to sequencing of the V3–V4 region of the 16S rRNA gene to assess microbial diversity and differential abundance. Seven machine learning algorithms (logistic regression, support vector machine, multi-layer perceptron, naïve Bayes, random forest, gradient boosting decision tree, and LightGBM) were used to build predictive models and identify key microorganisms, with fivefold cross-validation employed to ensure robustness. The Shapley Additive exPlanations (SHAP) algorithm was employed to explain the contribution of the core genera to the predictive model. Additionally, the PICRUSt2 algorithm was used to predict microbial functions.
Results
The salivary microbial composition in the pPN group showed significantly lower α- and β-diversity than that in healthy controls. A high-accuracy LightGBM model was developed, identifying six core genera (Fusobacterium, Solobacterium, Actinomyces, Porphyromonas, Atopobium, and Peptostreptococcus) as pPN biomarkers. Additionally, a visual pPN risk prediction system was developed. Differences in the predicted immune-response and metabolic functions of the salivary microbiota between patients with pPNs and healthy controls were also revealed.
Conclusions
This study highlights the potential clinical applications of the salivary microbiota to enable earlier detection and targeted interventions, offering significant promise for advancing clinical management and improving patient outcomes in pPNs.
Introduction
Persistent pulmonary nodules (pPNs) are a concern as they may progress to malignancies, but often lack obvious clinical symptoms [1, 2]. Relying solely on long-term CT follow-up not only increases radiation exposure but may also result in missing the optimal treatment window for pPNs. Moreover, studies indicate that extended CT monitoring of pPNs in individuals who do not eventually develop lung cancer can lead to psychological distress, potential physical harm, and additional healthcare costs [3, 4]. Therefore, identifying reliable biomarkers to accurately characterize pPNs is crucial for improving early diagnosis and personalized management.
Over the past decade, substantial progress has been made in unraveling the complex interactions between microbiota and lung diseases. As a primary source of lung microbiota, the oral microbiota plays a critical role in the development of cancer [5, 6]. For instance, Yang et al. [7] showed that oral microbiota dysbiosis promotes lung cancer development in non-smoking women, while Zhou et al. [8] found a positive correlation between oral bacteria levels and lung cancer progression from a cancer-free state, suggesting that the oral microbiota is a significant risk factor for lung cancer. Despite these advances, the microbial characteristics and functional roles associated with the development and progression of lung cancer remain largely unknown. Moreover, current studies primarily focus on established lung cancer, with limited attention to the early stages, such as pPNs. Therefore, accurately identifying microbial features characteristic of pPNs from the vast salivary microbiota dataset and translating these findings into practical clinical diagnostic tools remains a gap in current research.
Machine learning has been used successfully to analyze complex biological data and identify microbial biomarkers associated with diseases, driving the development of microbiota-based risk prediction models [9, 10]. However, very few studies have compared multiple machine learning algorithms to determine which model achieves the best predictive performance. Consequently, it remains unclear which machine learning approach is most effective for pPNs based on microbiota data. Furthermore, traditional data-driven approaches in machine learning lack transparency, making it challenging to meet the demands of healthcare services. Therefore, this method still requires further exploration and validation.
Our study is the first to prospectively collect saliva samples from pPN patients, profile them using high-throughput 16S rRNA sequencing, and apply seven machine learning algorithms, introducing a novel non-invasive approach for the early diagnosis of pPNs.
Methods
Trial design and participants
Specimens were collected prospectively, following the Prospective Specimen Collection and Retrospective Blinded Evaluation (PRoBE) design [11]. From June 2022 to January 2024, participants were prospectively enrolled in a comprehensive pulmonary nodules (PNs) management and early lung cancer detection program at the Affiliated Hospital of Chengdu University of Traditional Chinese Medicine, Sichuan Cancer Hospital, and Chengdu Hospital of Integrated Traditional Chinese and Western Medicine. This study was approved by the Ethics Committee of the Affiliated Hospital of Chengdu University of Traditional Chinese Medicine (Ethics Approval No. 2022KL-051) and registered in the Chinese Clinical Trial Registry (Registration No. ChiCTR2200062140). Written informed consent was obtained from all participants.
Inclusion criteria for pPNs
PNs were incidentally discovered during chest CT scans, and patients had no prior symptoms related to the nodules [1, 12]. Regarding nodule size and volume, those with a diameter (average of the longest axis and its perpendicular short axis on the image with the largest cross-sectional area of the lesion) ≥ 3 mm and a volume ≤ 250 mm³ were included. The following follow-up requirements were considered: newly detected PNs requiring follow-up of 3–6 months, or up to 1 year, depending on the specific characteristics of the nodules. Patients were included if their PNs did not exhibit resorption or resolution during the natural course of the disease.
Exclusion criteria
Exclusion criteria were as follows: (1) incomplete data: missing complete chest CT scans or saliva specimens; (2) nodule characteristics: calcified, fat-containing, or resorbed nodules on chest CT; (3) nodule size: nodules > 30 mm in diameter; (4) previous lung cancer treatment: lung cancer treatment prior to the CT scan or saliva specimen collection; (5) recent antibiotic therapy: antibiotic treatment within 3 months before the CT scan or saliva specimen collection; (6) suspected pulmonary tuberculosis: imaging findings suggestive of pulmonary tuberculosis or no lymph node enlargement; (7) untreated infectious diseases: a history of untreated infectious diseases; (8) autoimmune and granulomatous diseases: a history of autoimmune or granulomatous disease; (9) concurrent infections or oral diseases: concurrent respiratory infections, oral diseases, or other comorbidities; (10) chronic pulmonary conditions: chronic obstructive pulmonary disease (COPD) or bronchiectasis; (11) severe cardiovascular conditions: severe acute cardiovascular or cerebrovascular emergencies; (12) recent intake of microbial products: consumption of microbial foods (e.g., yogurt, kimchi) or probiotic or prebiotic products within the past 3 days.
Selection of the control group
Healthy individuals of a similar age, confirmed to have no PNs based on a chest CT, were selected as the control group. The participants also provided saliva samples. Rigorous recruitment criteria were applied to ensure comparability with the case group in terms of age, sex, smoking history, and tumor history while excluding factors that could potentially confound the study results. All chest CT images were comprehensively reviewed by two experienced radiologists specializing in thoracic imaging to ensure a consensus [13].
Sample collection
Collection of general information
Demographic data, including sex, age, smoking history, and personal tumor history, were collected. Chest CT data were also obtained.
Oral biological sample collection
Saliva samples were prospectively collected from participants at the time of initial nodule discovery. Prior to sampling, participants rinsed their mouths with water to minimize contamination. Non-stimulated saliva (2–3 mL) was collected in sterile EP tubes. Non-stimulated saliva, characterized by lower secretion rates and a longer residence time in the oral cavity than stimulated saliva, facilitated the capture of microbial samples from various oral sites. Samples were promptly stored on dry ice and transported to a laboratory within 4 h, where they were stored at − 80 °C for subsequent experiments.
In this study, to mitigate the impact of potential sample loss, we made efforts to recruit as many participants as possible. Based on the actual recruitment conditions from June 2022 to January 2024, we initially assessed 1,094 participants for eligibility. After rigorous screening, the final sample size was adjusted to 141 healthy individuals without PNs (healthy controls, HC group) and 342 patients with pPNs (pPN group). Figure 1 provides a flowchart detailing the enrollment process and reasons for participant exclusion. A comparative analysis of the demographic and clinical characteristics of the two groups revealed no significant differences in baseline features. Detailed patient and sample collection information are provided in Tables S1 and S2.
Microbial DNA extraction and sequencing
DNA extraction was performed with the EZNA Soil Kit (Omega Bio-tek, Norcross, GA, USA), and the DNA concentration and purity were assessed using a NanoDrop 2000C spectrophotometer (Thermo Fisher Scientific, USA). The V3-V4 regions of bacterial 16S rRNA genes were amplified via PCR using barcode-indexed primers (338-F: 5′-ACTCCTACGGGAGGCAGCAG-3′ and 806-R: 5′-GGACTACHVGGGTWTCTAAT-3′) with TransStart FastPfu DNA Polymerase (TransGen BioTech, AP221-02, Beijing, China) on an ABI GeneAmp 9700 PCR system (USA). The average read length was 456 bp (ranging from 454 to 459 bp). The PCR amplification conditions were based on reference [14]. Sequencing was performed using an Illumina MiSeq platform (Illumina, San Diego, CA, USA) for paired-end reads.
Amplicon sequence processing and analysis
Quality control of paired-end raw sequences was conducted using the FASTP online platform (version 0.19.6, available at https://github.com/OpenGene/fastp), and the reads were subsequently merged using FLASH software (version 1.2.11, available at http://www.cbcb.umd.edu/software/flash). Seven samples with low-quality reads or adapter contamination were excluded. We then used the DADA2 plugin for quality filtering, denoising, merging paired reads, and removing chimeras. These steps resulted in the construction of an amplicon sequence variant (ASV) table. To classify the sequence variants at the genus level, we employed the naïve Bayes (NB) classifier method implemented in DADA2 within QIIME2 (version 2022.2). Species-level classification was determined based on exact matches (100% identity) with the reference sequences of ASVs [15].
Statistical analysis
Microbiota diversity assessment
α-Diversity indices (Ace, Sobs, Shannon) were computed using Mothur software (http://www.mothur.org/w/Calculators) and visualized using R software (version 3.6.2). β-diversity was analyzed using principal coordinate analysis (PCoA) and non-metric multidimensional scaling (NMDS), both based on Bray–Curtis distances, and the results are presented graphically [16].
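To make these indices concrete, the following is a minimal Python sketch of two of the α-diversity measures (Sobs and Shannon) and of the Bray–Curtis dissimilarity that underlies the PCoA and NMDS ordinations. It is illustrative only: the study computed these quantities with Mothur and R, the Ace estimator is omitted for brevity, and the two count vectors are hypothetical.

```python
import math

def sobs(counts):
    """Observed richness: number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

# Hypothetical genus-level counts for two saliva samples (not study data).
sample_a = [10, 0, 5, 5]
sample_b = [5, 5, 0, 10]

richness_a = sobs(sample_a)              # taxa present in sample_a
h_a = shannon(sample_a)                  # evenness-weighted diversity
bc_ab = bray_curtis(sample_a, sample_b)  # 0 = identical, 1 = no shared taxa
```

A lower Shannon value in one group, as reported for the pPN group, corresponds to a community dominated by fewer taxa.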
Comparison of microbial differences between pPN and HC groups
Linear discriminant analysis effect size (LEfSe) was used to identify genus-level abundance variations between groups (http://huttenhower.sph.harvard.edu/LEfSe), with thresholds of linear discriminant analysis (LDA) score > 2 and P < 0.05. The correlation between the salivary microbial community composition and clinical characteristics was evaluated using Spearman’s correlation coefficient and visualized using heatmaps [17].
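As an illustration of the correlation measure, Spearman's coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing their mean rank. The sketch below is a plain-Python rendering under that definition; the function names are hypothetical and it is not the software used in the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed, zero-inflated scale typical of relative abundance data.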
Microbial functional and correlation analysis
Functional pathway analysis was performed using PICRUSt2, based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG). All analyses adhered to the PICRUSt2 protocol [18]. Spearman correlation coefficients were used to assess the relationship between the salivary microbial community composition and pulmonary nodules, presented visually using heatmaps and correlation maps. Differences in microbial richness, diversity, and clinical features were compared using Wilcoxon rank-sum tests, independent sample t-tests, Chi-square tests, and Fisher’s exact tests (SPSS® Statistics v22). Statistical significance was defined as P < 0.05.
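For readers unfamiliar with the Wilcoxon rank-sum test used for the group comparisons, a minimal sketch follows. It uses the large-sample normal approximation without tie or continuity correction, which is a simplifying assumption; the study ran its tests in SPSS, and the function name here is hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation with no tie or continuity correction.
    Returns (z, p); z > 0 means values in x tend to exceed values in y.
    """
    n1, n2 = len(x), len(y)
    # U counts the pairs in which x beats y; ties count as half a win.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of N(0, 1)
    return z, p

# Hypothetical Shannon indices for two groups (illustrative values only).
hc = [3.1, 3.4, 3.3, 3.6, 3.5, 3.8, 3.7, 3.9]
ppn = [2.1, 2.4, 2.3, 2.6, 2.5, 2.8, 2.7, 2.9]
z, p = rank_sum_test(hc, ppn)
```

With the study's threshold, a result would be called significant when `p < 0.05`.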
Development and validation of the machine learning model
We employed seven widely accepted machine learning algorithms to develop predictive models. LR serves as a fundamental linear classification model with interpretability [19]. SVM and MLP capture complex patterns and are effective in high-dimensional predictive tasks [20, 21]. NB [22] is a probabilistic classifier based on Bayes' theorem with strong independence assumptions among features. RF [20], GBDT [23], and LightGBM [24], as ensemble learning models, enhance prediction accuracy by combining multiple learners to reduce variance and bias. The dataset was randomly split into training (80%) and testing (20%) sets using a stratified shuffle split. For model evaluation, we employed fivefold cross-validation on the training set: the training data were split into five subsets, and each subset served once as the validation set while the model was trained on the remaining four, which helped us identify the most robust experimental setup to support our conclusions. To ensure reliability, we repeated the training with different random seeds across 30 iterations, averaging the results to derive the final performance metrics [25]. This approach allowed us to select the optimal hyperparameters for each algorithm based on its performance during cross-validation.
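The resampling protocol described above can be sketched in plain Python as follows. This is one reading of the procedure, not the authors' code: the helper names are hypothetical, the seeds are arbitrary, and the label vector only mirrors the reported group sizes (141 HC, 342 pPN).

```python
import random

def stratified_split(labels, test_frac, seed):
    """Return (train_idx, test_idx), preserving the class ratio in each part."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def stratified_kfold(indices, labels, k, seed):
    """Yield (fit_idx, val_idx) pairs; each index is validated exactly once."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels[i] for i in indices)):
        idx = [i for i in indices if labels[i] == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal class members round-robin
    for f in range(k):
        val = sorted(folds[f])
        fit = sorted(i for g in range(k) if g != f for i in folds[g])
        yield fit, val

# Labels mirroring the reported group sizes: 0 = HC, 1 = pPN.
labels = [0] * 141 + [1] * 342
train_idx, test_idx = stratified_split(labels, test_frac=0.2, seed=0)

# Fivefold cross-validation on the training set, repeated with 30 seeds.
splits = [list(stratified_kfold(train_idx, labels, k=5, seed=s))
          for s in range(30)]
```

In such a scheme, a model would be fitted on each `fit` index list and scored on the matching `val` list, the scores would be averaged over folds and seeds, and the held-out test indices would be touched only once hyperparameters are fixed.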
To evaluate the model’s performance, the model with the highest area under the receiver operating characteristic curve (AUC) was selected as the optimal model of each algorithm [26, 27]. These machine learning models were implemented using Python (version 3.7) and Project Jupyter version 1.2.3 (Anaconda, Inc., https://jupyter.org/about).
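As a reminder of what the selection criterion measures, the AUC can be obtained by sweeping the decision threshold from the highest score downward and integrating the resulting ROC curve with the trapezoidal rule. The sketch below assumes binary labels coded 0/1; the candidate names and scores are hypothetical and unrelated to the study's models.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for 0/1 labels via the trapezoidal rule."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    auc = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Lower the threshold past every sample sharing this score.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        tpr, fpr = tp / pos, fp / neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return auc

# Hypothetical predicted probabilities from two candidate configurations.
labels = [0, 0, 1, 1]
candidate_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}
aucs = {name: roc_auc(labels, s) for name, s in candidate_scores.items()}
best = max(aucs, key=aucs.get)  # configuration with the highest AUC
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.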
Feature importance
To interpret the predictions of the LightGBM model, we employed SHapley Additive exPlanations (SHAP) [28]. SHAP values assign an importance score to each feature, quantifying how much that feature contributes to a given model prediction. A positive SHAP value indicates that a feature increases the predicted likelihood, whereas a negative value indicates that it decreases it. In the summary plot, each point represents a case in the dataset, with colors representing feature values: blue for lower and red for higher values. Individual SHAP values were computed using the TreeExplainer module (Python version 3.7). Summarizing these values across all cases gave the average influence of each feature on the prediction of pPN outcomes, providing a metric of feature contribution. Finally, we employed graph layouts (Cytoscape 3.7.2) to visualize how features (via SHAP summary plots and SHAP force plots) contributed to pPN events within the LightGBM model.
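The idea behind SHAP values can be illustrated with an exact Shapley computation on a toy model. The sketch below enumerates all feature coalitions, which is feasible only for a handful of features; it is not the TreeExplainer algorithm used in the study, and the toy model, input, and all-zero baseline are assumptions chosen for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical score with one interaction term (not the study's model)."""
    return 0.5 * z[0] - 0.3 * z[1] + 0.2 * z[0] * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that force plots display; here the interaction term is shared equally between the two features involved.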
Results
Richness across 16S rRNA amplicon-specific datasets
In this study, saliva samples from 483 participants underwent 16S rRNA analysis. After rigorous quality assessment and ASV clustering, 99,914 optimized data sequence entries were obtained, with an effective sequence length of 10,635. The average read length was 456 bp (ranging from 454 to 459 bp). Figure S1 shows that the abundance curve obtained from the Pan/Core analysis, based on the genus-level classification annotation, tended to be flat, indicating a sufficient sequencing sample size. Species annotations included one domain, one kingdom, 24 phyla, 51 classes, 115 orders, 213 families, 469 genera, and 1077 species.
Changes in the salivary microbiota of the pPN group
Microbial diversity and differential analyses
First, genus-level rarefaction curve analysis confirmed that sequencing depth and coverage were adequate and that the sequencing data were robust (Figure S2). Subsequently, a comparison of salivary microbiota diversity between patients with pPNs and healthy individuals showed significantly lower α-diversity in the pPN group based on the Ace, Sobs, and Shannon indices, indicating reduced genus-level richness (Fig. 2A–C).
α- and β-diversity of salivary microbial communities in the HC and pPN groups. A Ace index of salivary microbiota in both groups. B Sobs index of salivary microbiota in both groups. C Shannon index of salivary microbiota in both groups. D Intergroup differences in β-diversity of salivary microbiota. E, F Comparison of salivary microbial communities between the HC and pPN groups using PCoA and NMDS based on Bray–Curtis distance (E, ANOSIM R2 = 0.2139, P = 0.001, F, ANOSIM R2 = 0.3139, P = 0.001). PCoA principal coordinate analysis, NMDS non-metric multidimensional scaling. P < 0.05 (*), P < 0.01 (**), P < 0.001 (***)
β-diversity analysis revealed distinct microbial community structures between the groups (Fig. 2D), and PCoA and NMDS, based on Bray–Curtis distances, revealed the tight clustering of samples within each group (Fig. 2E, F), consistent with the α-diversity findings. Overall, these findings indicate that the salivary microbiota of pPN patients is less diverse than that of healthy individuals. Similarly, recent prospective studies have found that lung cancer patients have lower oral microbiota α-diversity than healthy individuals, indicating a potential association with increased disease risk. This aligns closely with our findings.
Differences in the microbial community compositions between the pPN and HC groups
Further analyses comparing microbial community compositions between patients with pPNs and HC revealed distinct group-specific patterns. Venn diagrams showed 175 genera unique to the pPN group, 59 unique to the HC group, and 294 genera shared by both (Figure S3). Relative abundance comparisons (Fig. 3A, B, Figure S4) indicated significant differences in specific microbial genera, highlighting distinct microbial profiles between the two groups.
Differences in microbial community compositions between the pPN and HC groups. A Community bar plot displaying the percentage of community abundance of salivary microbiota at the genus level in the two groups; B ANCOM differential abundance volcano plot showing significant differences in each ASV feature at the genus level between the two groups. The y-axis value represents the empirical distribution of W; the x-axis value represents the clr-transformed mean difference in abundance (between groups). Positive x-axis values indicate genera enriched in the pPN group, while negative x-axis values indicate genera enriched in the HC group. C Comparison of microbial differences between the two groups using intergroup difference testing (P value: Wilcoxon rank-sum test). D LEfSe analysis comparing microbial enrichment differences between the two groups (LDA score > 2.0). W: the number of times each feature was identified as significantly different in intergroup comparisons. clr centered log ratio, LDA linear discriminant analysis
To identify genera with statistically significant differences between the pPN and HC groups, ANCOM analysis was conducted. The results showed several salivary genera with distinct abundances between the two groups (Fig. 3C). Further analysis using the Wilcoxon rank-sum test and LEfSe identified nine genera with significant differences (Figs. 3D, E). The genera Streptococcus, Haemophilus, Achromobacter, and Alloprevotella were significantly more abundant in the HC group than in the pPN group. Conversely, Porphyromonas, Selenomonas, Granulicatella, Peptostreptococcus, and Treponema were significantly more abundant in the pPN group than in the HC group. Our study further confirms the potential of salivary microbiota in distinguishing healthy individuals from patients with pPNs.
Machine learning model for predicting pPNs based on microbiota features
We employed and compared seven classical machine learning algorithms, namely LR, SVM, MLP, NB, RF, GBDT, and LightGBM, to develop a predictive model for pPNs. To ensure robust conclusions, we randomly split the 483 samples into a training set (80%) and a test set (20%) using a stratified shuffle split, applied fivefold cross-validation within the training set, and confirmed that demographic and clinical variables did not differ significantly between the two sets.
After constructing and training the models, we evaluated their performance using metrics such as AUC, precision-recall curve, and F1 score. The AUC values ranged from 0.728 to 0.877, demonstrating that machine learning models based on salivary microbiota features can effectively distinguish between healthy controls and pPN patients (Fig. 4A–G). Previous studies have established the feasibility of using microbiome data for lung disease classification [5]. However, most efforts have focused on gut microbiota, lung tissue, or bronchoalveolar lavage fluid samples, which are less suitable for routine screening. In contrast, our study further confirms the potential of salivary microbiota as biomarkers and offers a more accessible approach for early disease detection.
Performance evaluation of seven predictive models based on salivary microbiota features. A–G. Normalized confusion matrices and corresponding AUC for LR (A), SVM (B), MLP (C), NB (D), RF (E), GBDT (F), and LightGBM (G). The confusion matrices consist of False Positives, False Negatives, True Positives, and True Negatives. H Comparison of AUC among the seven predictive models, with AUC reflecting the performance of the binary classification models. I Comparison of the F1 scores among the seven predictive models. J Comparison of the Precision-Recall Curves among the seven predictive models. F1 score and Precision-Recall Curve are comprehensive metrics for evaluating model performance, with a larger area under the curve indicating better and more stable model performance. LR logistic regression, SVM support vector machine, MLP multi-layer perceptron, NB naïve Bayes, RF random forest, GBDT gradient boosting decision tree, LightGBM Light Gradient Boosting Machine, AUC the area under the receiver operating characteristic curve
Among the models, LightGBM achieved the highest AUC (0.877), area under the precision-recall curve (0.955), and F1 score (0.783), making it the optimal choice for predicting pPNs based on salivary microbiota characteristics (Fig. 4H–J, Table S3).
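For reference, the F1 score reported above is the harmonic mean of precision and recall. A minimal sketch of the computation from predicted and true class labels follows; the function name and the example labels are hypothetical and do not reproduce the study's confusion matrices.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical labels: 1 = pPN, 0 = HC (not study data).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Unlike the AUC, F1 depends on the classification threshold, so the two metrics can rank models differently when classes are imbalanced, as they are in this cohort.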
Predictive microbial biomarkers for pPNs
Subsequently, we focused on the top 15 genera in the LightGBM model (Fig. 5A). After reviewing the literature and excluding nine unclassifiable or minimally disease-relevant genera, we identified six key salivary microbial genera based on their importance, namely Fusobacterium, Solobacterium, Atopobium, Porphyromonas, Actinomyces, and Peptostreptococcus. These genera were highly significant in the model. Notably, when using only these six genera for the machine learning predictions of pPN, the AUC was nearly identical to that when using all identified genera (AUC1 = 0.877, AUC2 = 0.872) (Fig. 5B). These findings highlight the strong discriminative power of these six genera as potential salivary microbial biomarkers of pPN.
Predictive microbial biomarkers for pPNs. A Feature importance plot displaying the top 15 genera ranked by importance scores using the LightGBM model; the values represent the importance scores of each genus. B Comparison of the test set AUC for all microbial features versus the top 6 microbial features (AUC1 vs AUC2). C Correlation Volcano Plot showing the correlation between salivary microbiota genera and pPNs. The x-axis represents the correlation coefficient, and the y-axis represents − log10 (P value). Colors range from blue to red, indicating the transition from negative to positive correlation. D Heat map showing the correlation between 6 genera and pPNs. Colors range from blue to red, indicating the transition from negative to positive correlation. AUC the area under the receiver operating characteristic curve
Furthermore, we established a significant correlation between pPNs and the salivary microbiotas. Using Spearman’s correlation analysis, we identified a notable association between pPNs and the six core genera. The correlation map revealed significant relationships between pPNs and the salivary microbiotas, with all six genera showing significant correlations (P < 0.01) (Fig. 5C). A heatmap analysis further highlighted stronger correlations with Fusobacterium, Atopobium, Solobacterium, and Peptostreptococcus, with Atopobium showing a negative correlation (Fig. 5D).
Of note, we identified potentially complex interactions within these microbial populations. Correlations were observed among Fusobacterium, Solobacterium, and Actinomyces, as well as between Peptostreptococcus and Porphyromonas. Recent advances in microbiome research have shown that cooperation, competition, and resistance induction among microorganisms significantly influence disease prognosis [29, 30]. Our results support these findings and highlight the importance of these interactions in the context of pPNs.
Explaining feature importance in the LightGBM model
To further elucidate the importance of the features in the LightGBM model, we calculated the SHAP values to visualize the relationship between variable importance and pPN prediction (Fig. 6).
SHAP Algorithm Explanation of Important Features. A SHAP summary plots showing the contribution of 6 genera to the model output. Positive SHAP values indicate an increased likelihood of the predicted outcome, while negative SHAP values indicate a decreased likelihood. The y-axis represents the feature importance ranking. Each point represents a case in the dataset, with the color indicating the feature value, blue representing the lowest range. B SHAP force plot explaining a single sample correctly classified as HC group, visually illustrating the contribution of each feature to the prediction. C SHAP force plot for a sample correctly classified as pPN group, visually illustrating the contribution of each feature to the prediction. D pPNs visualization risk prediction system. The left section allows input of data for six core microbiotas, while the right section presents the results, including the risk probability of pPNs and the contribution of feature variables to this probability. Red arrows indicate features that positively contribute to the prediction value, while blue arrows indicate features that negatively contribute. The length of the arrows represents the magnitude of the feature contribution, with the sum determining the final prediction value. SHAP SHapley Additive exPlanations
Fusobacterium was the crucial variable with the greatest influence on model prediction, followed by Actinomyces, Atopobium, Solobacterium, Peptostreptococcus, and Porphyromonas. When the abundances of Fusobacterium, Actinomyces, Solobacterium, Peptostreptococcus, and Porphyromonas were higher (indicated by pink dots), the SHAP values tended to be higher, indicating a positive correlation with pPN prediction. In contrast, Atopobium exhibited the opposite trend (Fig. 6A).
Violin plots were used to illustrate the differences in abundance of the six genera between the two groups (Figure S6A). Compared to the HC group, the pPN group exhibited higher average abundances of Fusobacterium, Solobacterium, Actinomyces, Peptostreptococcus, and Porphyromonas, whereas the abundance of Atopobium was lower, confirming the findings of the SHAP summary plots.
The SHAP force plots were designed to explain the predictive capabilities of the model for individual cases. Figures 6B, C show two cases correctly classified into respective HC and pPN groups. For cases correctly classified as pPNs, Peptostreptococcus, Fusobacterium, and Actinomyces had the greatest effect on the output of the pPN prediction model, whereas Atopobium was the only negative factor, pushing model prediction towards the HC group (Fig. 6B). In the HC group, Atopobium, Actinomyces, and Porphyromonas drove the prediction towards the HC group, with Atopobium contributing the most (Fig. 6C). In summary, these observational findings highlight the effect of microbial feature selection on the output of the pPN prediction model.
To enhance the clinical applicability of our predictive model, we created a visual pPN risk prediction system that uses SHAP interpretations of the machine learning model outputs (http://124.223.3.36:8081/). As illustrated in Fig. 6D, the system comprises two sections: the left side for information input and the right side for result presentation. Users can enter data related to the six core oral microbial genera in the left section. The upper portion of the right section displays the probability of the patient having pPN, while the lower portion illustrates the contribution of feature variables to this risk probability, assisting clinicians in formulating effective management strategies.
Potential biological functions of the salivary microbiota
Functional prediction analyses of microbial genes in saliva samples from the HC and pPN groups were performed using the PICRUSt2 algorithm, with biological functions annotated via the KEGG and COG databases. Differential enrichment analysis (Wilcoxon rank-sum test, P < 0.05) revealed that although most functional pathways were similar between the two groups, significant differences were observed in pathways related to amino acid metabolism (K02029), rRNA processing and RNA modification (K06180), cellular transport and metabolism (K02035), and antigen presentation (K02004, K02003) in the pPN group (Fig. 7A, B). COG functional classification further supported these findings, showing the enrichment of genes related to coenzyme transport, protein synthesis, DNA repair, and immune defense in the pPN group (Fig. 7C, D).
Predicted microbiota functions using the PICRUSt2 algorithm. A Heatmap displaying enriched KEGG pathways between the HC and pPN groups, with colors ranging from orange to green, indicating low to high correlation. B Bar chart displaying KEGG pathways with significant differences between the HC and pPN groups. C Box plot showing the distribution differences of various COG functional categories between the sample groups. D Bar chart displaying COG pathways with significant differences between the HC and pPN groups. P < 0.05 (*), P < 0.01 (**), P < 0.001 (***). KEGG Kyoto Encyclopedia of Genes and Genomes, COG Clusters of Orthologous Groups
Discussion
Currently, a widely applicable noninvasive method to accurately assess pPN status is lacking. In this study, we proposed a novel strategy based on salivary microbiota characteristics to differentiate between patients with pPNs and the HC group using 16S rRNA analysis and machine learning models. This approach identified six core genera as potential predictive biomarkers for pPNs and developed an efficient, easy-to-use machine learning model for prediction. Additionally, we explored differences in biological functions between the pPN and HC groups. These findings support the potential of salivary microbiota as biomarkers for early detection of pPNs, providing reliable diagnostic tools and personalized management strategies for clinicians.
The relationship between the oral microbiota and respiratory diseases has been extensively investigated [31, 32]. The oral cavity is connected to the lower respiratory tract, making the oral microbiota a potential factor that influences the pulmonary microenvironment. Increasing evidence indicates that changes in oral microbiota diversity are associated with the development and prognosis of diseases such as COPD, asthma, and lung cancer [33]. Our results showed that salivary microbiota α- and β-diversity indices in patients with pPNs were lower than those in the HC group. Subsequent Wilcoxon rank-sum tests and LEfSe analyses revealed significant differences in various salivary microbiotas between the HC and pPN groups. Previous studies have identified a positive correlation between the relative abundance of certain genera and the risk of lung cancer [34]. Our study utilized oral samples and, for the first time, identified distinct microbial communities between patients with pPNs and healthy individuals. This discovery provides new biomarkers and detection methods for the non-invasive diagnosis of pPNs.
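To make the diversity comparison concrete, the sketch below computes one common α-diversity index (Shannon) and one common β-diversity measure (Bray-Curtis dissimilarity). The choice of these two measures is an assumption for illustration, since this section does not name the specific indices used, and the genus counts are invented.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors (0 = identical, 1 = disjoint)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    return numerator / (sum(a) + sum(b))

# Invented genus-level read counts for two saliva samples.
sample_hc = [120, 95, 80, 60, 45, 30]
sample_ppn = [300, 40, 25, 10, 5, 2]

print("Shannon (HC sample): ", round(shannon_index(sample_hc), 3))
print("Shannon (pPN sample):", round(shannon_index(sample_ppn), 3))
print("Bray-Curtis distance:", round(bray_curtis(sample_hc, sample_ppn), 3))
```

A community dominated by a single genus yields a lower Shannon value than an even one, which is the sense in which α-diversity is described as lower in one group.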
The high-throughput sequencing data generated from 16S rRNA sequencing is characterized by high dimensionality, sparsity, and noise. Traditional statistical methods often struggle to identify consistent and robust biomarkers when dealing with these large and complex microbiome datasets [35]. To address this challenge, we adopted a comprehensive approach that integrates 16S rRNA sequencing with machine learning algorithms, enabling efficient processing of a wide range of microbiome data [36]. We employed seven widely accepted machine learning algorithms to develop predictive models. Through comparison, we found that ensemble learning models, particularly LightGBM, outperformed the others in predictive accuracy (AUC = 0.877). This is consistent with the results of previous studies comparing the performance of multiple models [37, 38]. LightGBM’s superior performance is due to its efficient data handling capabilities and robust feature utilization during training, making it particularly effective for large datasets, such as microbiota data. LightGBM also demonstrates excellent generalization capability, especially with smaller sample sizes, which suits our dataset well. Its stability ensures consistent predictive performance, and its robustness allows it to handle noise and outliers effectively. This is the first study to develop and compare multiple widely accepted machine learning models for predicting pPNs. Our findings reveal that the LightGBM model offers superior predictive power in distinguishing healthy individuals from patients with pPNs.
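The evaluation logic behind a cross-validated AUC can be sketched as follows. This is an illustration only: the scorer is a toy class-centroid projection standing in for the seven algorithms compared in the study (it is not LightGBM), the data are synthetic, and all function names are hypothetical.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds

def centroid_scorer(X, y):
    """Toy stand-in model: project a sample onto the difference of the class means."""
    d = len(X[0])
    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    mu_case = mean([x for x, l in zip(X, y) if l == 1])
    mu_ctrl = mean([x for x, l in zip(X, y) if l == 0])
    w = [a - b for a, b in zip(mu_case, mu_ctrl)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))

def cross_validated_auc(X, y, k=5, seed=0):
    """Mean held-out AUC over k stratified folds."""
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = centroid_scorer([X[i] for i in train_idx], [y[i] for i in train_idx])
        aucs.append(roc_auc([y[i] for i in test_idx], [score(X[i]) for i in test_idx]))
    return sum(aucs) / len(aucs)

# Synthetic two-feature data: 40 controls and 60 cases with a shifted first feature.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
X += [[rng.gauss(1.5, 1), rng.gauss(0, 1)] for _ in range(60)]
y = [0] * 40 + [1] * 60
print("Mean fivefold AUC:", round(cross_validated_auc(X, y), 3))
```

Stratifying the folds keeps the case-to-control ratio similar across folds, which matters when the groups are unbalanced, as they are here (141 controls and 342 patients).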
Microbiota features that contribute significantly to disease prediction models are commonly viewed as emerging biomarkers for disease diagnosis and future mechanistic studies [39]. Using LightGBM, we identified six core genera associated with pPNs (Fusobacterium, Solobacterium, Actinomyces, Porphyromonas, Atopobium, and Peptostreptococcus). Moreover, in our independent validation cohort, the model maintained excellent performance, consistent with previous results, effectively distinguishing patients with pPNs from healthy individuals using only these six core genera. There is a need for further investigation into microbiota modulation as a therapeutic intervention, exploring how manipulating the composition and functionality of the microbiota can positively influence disease outcomes.
Subsequently, the SHAP algorithm was used to elucidate the interpretability of the key features and clarify their significant associations with pPNs, thereby enhancing the transparency and trustworthiness of the results produced by the machine learning models. These features are identified as the core microbiotas associated with pPNs. Specifically, the positive correlations between pPNs and Fusobacterium, Actinomyces, Porphyromonas, Solobacterium, and Peptostreptococcus suggest their importance. Moreover, the negative correlation between Atopobium and pPNs might imply its potential inhibitory role in the progression of pPNs. Collectively, these findings highlight the critical importance of these microbial features and emphasize the necessity for further investigation into how modulation of these microbial populations could influence disease outcomes.
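The idea behind SHAP attributions can be illustrated by computing exact Shapley values for a small model by enumerating all feature subsets. This sketch does not use the SHAP library or the study's LightGBM model. The logistic model, its weights, the bias, and the sample values are all hypothetical, with signs chosen only to mirror the directions of association described above.

```python
import math
from itertools import combinations

GENERA = ["Fusobacterium", "Solobacterium", "Actinomyces",
          "Porphyromonas", "Atopobium", "Peptostreptococcus"]
WEIGHTS = [0.9, 0.6, 0.5, 0.8, -0.7, 0.4]  # hypothetical coefficients
BIAS = -1.0                                 # hypothetical intercept

def predict(z):
    """Toy logistic model returning a pPN probability for six standardized abundances."""
    logit = BIAS + sum(w * v for w, v in zip(WEIGHTS, z))
    return 1 / (1 + math.exp(-logit))

def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition are set to their baseline value."""
    n = len(x)

    def value(subset):
        return model([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            for coalition in combinations(others, size):
                members = set(coalition)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi

sample = [1.2, 0.8, 0.5, 1.0, 0.9, 0.3]  # hypothetical standardized abundances
baseline = [0.0] * 6                      # reference point: average abundance

phi = shapley_values(predict, sample, baseline)
for genus, contribution in zip(GENERA, phi):
    print(f"{genus:20s} {contribution:+.4f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction at the baseline, which is the additivity property that makes per-feature attributions interpretable. Enumeration is feasible here only because there are six features; tree-specific algorithms are used for larger models.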
A SHAP-based risk prediction system aids in classifying patient risk, helping clinicians identify risk factors for high-risk patients and monitor early signs in low-risk patients. It supports personalized management and reduces psychological stress and healthcare burdens. However, integrating this system into clinical workflows alongside other data modalities, such as text, CT images, and patient demographics, presents challenges. A standardized data preprocessing pipeline could be established in the future, with procedures for identifying outliers, standardizing data formats, and addressing missing data. Following this, a multimodal machine learning model could be developed to process these diverse datasets, ultimately generating a single predictive probability. Addressing these challenges will necessitate further refinements to ensure compatibility with various clinical practices.
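The three preprocessing steps named above (missing data, outliers, standardization) could look like the following for a single numeric variable. This is a sketch under stated assumptions: median imputation, the 1.5 × IQR rule, and z-scoring are common defaults chosen here for illustration, not procedures specified by the study, and the age values are invented.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def flag_outliers_iqr(values, k=1.5):
    """Flag values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]

def standardize(values):
    """Convert values to z-scores (zero mean, unit population standard deviation)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]

def preprocess(values):
    """Impute, flag outliers for review, then standardize. Returns (z_scores, flags)."""
    filled = impute_median(values)
    return standardize(filled), flag_outliers_iqr(filled)

# Invented patient ages with one missing value and one implausible entry.
ages = [54, 61, None, 58, 47, 66, 59, 63, 140]
z_scores, flags = preprocess(ages)
print("Flagged for review:", [a for a, f in zip(impute_median(ages), flags) if f])
```

Flagged values are reported rather than dropped, so that a clinician or data manager can decide whether an entry is an error or a genuine extreme.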
Of note, similarities and overlap were observed between the pPN-associated and lung cancer-associated microbiota. A previous case–control study identified Solobacterium as a significant oral pathogen that causes oral diseases. Further, a prospective cohort study showed that various oral microbiota imbalances are significantly associated with lung cancer risk [34, 40]. Several studies have also shown that anaerobic bacteria such as Fusobacterium, Solobacterium, Porphyromonas, Actinomyces, and Peptostreptococcus are more abundant in the oral or lower respiratory tract of patients with lung cancer, which is consistent with our findings. Notably, bacteria such as Fusobacterium and Porphyromonas, which are known for their potential involvement in early tumorigenesis, were also identified. Similar studies, such as those by Abed et al. [41], demonstrated the presence of Fusobacterium DNA in tumor tissues of lung adenocarcinoma and confirmed it was associated with increased tumor mortality and malignancy. Additionally, elevated levels of Porphyromonas gingivalis have been reported in patients with early-stage lung cancer compared to those in healthy controls, suggesting its potential involvement in the initiation of lung cancer, possibly even in the presence of PNs [42, 43]. Both Fusobacterium and Porphyromonas have been implicated in inducing the release of proinflammatory cytokines, adhesion molecules, and growth factors, thereby promoting a favorable tumor microenvironment [44, 45], suggesting that these bacteria promote inflammation and tumorigenesis.
This is similar to our predictions using the advanced PICRUSt2 algorithm, which highlighted the differences in bacterial gene functions between the groups, supporting the notion that the microbiota might indirectly facilitate the development of diseases, such as pPNs, by modulating the host immune responses and metabolic activities. Notably, our functional analysis identified unique pathways, such as oxidative phosphorylation and persistent dysregulation of immune function [46]. These insights highlight potential new mechanisms by which persistent immune dysregulation and metabolic reprogramming might contribute to a pro-tumor microenvironment, ultimately promoting disease progression [47]. Future research integrating metabolomics and proteomics could elucidate how microbiotas modulate metabolic pathways, immune responses, and protein signaling, providing a more comprehensive understanding of pPN mechanisms and identifying therapeutic targets.
We also observed a negative correlation between pPNs and salivary Atopobium abundance. Consistent with our findings, previous studies have shown that Atopobium is less abundant in patients with lung cancer than in healthy controls [48]. Several studies have shown that specific antigens of Atopobium spp. can trigger immune responses leading to pulmonary sarcoidosis [30]. Zimmermann et al. [49] identified Atopobium spp. as a novel candidate microbiota component associated with pulmonary sarcoidosis. However, previous research has shown inconsistencies. Earlier studies indicated a higher relative abundance of Atopobium in individuals with non-small cell lung cancer, suggesting an associated increased risk [50]. These inconsistencies may stem from differences in study populations or methods. Overall, our study suggests that greater abundances of Fusobacterium, Actinomyces, Solobacterium, Peptostreptococcus, and Porphyromonas might contribute to the development of pPNs, whereas Atopobium might inhibit this condition.
Importantly, microbial interactions have a pivotal role in the progression of complex diseases [29]. Our study specifically reveals correlations among Fusobacterium, Solobacterium, and Actinomyces, as well as between Peptostreptococcus and Porphyromonas. Previous studies have indicated that Fusobacterium, Actinomyces, and Porphyromonas have symbiotic relationships in chronic inflammatory diseases, potentially exacerbating infection severity [51, 52]. We speculate that their interactions within the pulmonary microenvironment could promote sustained inflammatory responses, affect the local immune balance, and potentially contribute to the malignant transformation of pPNs.
Although these hypotheses are intriguing, the current findings have only begun to uncover the true complexity of the relationship between microorganisms and pPNs. This study had several limitations.
First, there are certain limitations to using the salivary microbiota as a biomarker. Factors like diet, oral hygiene, health status, gender, and age can influence microbiota composition, potentially affecting model performance across different populations. Validation in diverse pPN groups is needed to improve robustness. Second, although we explored the association between the baseline salivary microbiotas (at the time of the initial CT scan) and pPNs, we did not establish causality. Large-scale, multi-center prospective cohort studies with longitudinal tissue and microbial sample collection are necessary to confirm whether core microbiotas are key risk factors for pPNs.
Conclusion
In summary, the current results revealed the potential of Fusobacterium, Solobacterium, Actinomyces, Porphyromonas, Atopobium, and Peptostreptococcus to distinguish between patients with pPNs and healthy individuals, suggesting their potential as predictive microbial biomarkers of pPNs. These microbial differences might contribute to immune dysfunction and metabolic reprogramming, thus providing insights into pPN-associated microbial imbalances. This study highlights the potential role of these microbes in pPNs. We recommend future research focus on clinical validation of these biomarkers and their integration into routine screening. Pursuing these directions could enhance early lung cancer detection and provide valuable insights into microbial factors in cancer prevention.
Availability of data and materials
The raw metagenomic data generated in this study have been deposited in the NCBI Sequence Read Archive, with the accession code PRJNA1114406. All code used in the manuscript is available at: https://github.com/qiong0129/HC-pPN.
References
MacMahon H, Naidich DP, Goo JM, Lee KS, Leung ANC, Mayo JR, et al. Guidelines for management of incidental pulmonary nodules detected on CT images: from the Fleischner Society 2017. Radiology. 2017;284:228–43.
Wahidi MM, Govert JA, Goudar RK, Gould MK, McCrory DC. Evidence for the treatment of patients with pulmonary nodules: when is it lung cancer?: ACCP evidence-based clinical practice guidelines (2nd edition). Chest. 2007;132:94S–107S.
Wiener RS, Gould MK, Woloshin S, Schwartz LM, Clark JA. “The thing is not knowing”: patients’ perspectives on surveillance of an indeterminate pulmonary nodule. Health Expect. 2015;18:355–65.
Freiman MR, Clark JA, Slatore CG, Gould MK, Woloshin S, Schwartz LM, et al. Patients’ knowledge, beliefs, and distress associated with detection and evaluation of incidental pulmonary nodules for cancer: results from a multicenter survey. J Thorac Oncol. 2016;11:700–8.
Goto T. Microbiota and lung cancer. In: Seminars in cancer biology. Amsterdam: Elsevier; 2022. p. 1–10.
Georgiou K, Marinov B, Farooqi AA, Gazouli M. Gut microbiota in lung cancer: where do we stand? Int J Mol Sci. 2021;22:10429.
Yang J, Mu X, Wang Y, Zhu D, Zhang J, Liang C, et al. Dysbiosis of the salivary microbiome is associated with non-smoking female lung cancer and correlated with immunocytochemistry markers. Front Oncol. 2018;8:520.
Zhou B, Lu J, Beck JD, Moss KL, Prizment AE, Demmer RT, et al. Periodontal and other oral bacteria and risk of lung cancer in the atherosclerosis risk in communities (ARIC) study. Cancer Epidemiol Biomarkers Prev. 2023;32:505–15.
Su Q, Liu Q, Lau RI, Zhang J, Xu Z, Yeoh YK, et al. Faecal microbiome-based machine learning for multi-class disease diagnosis. Nat Commun. 2022;13:6818.
Santos-Júnior CD, Torres MDT, Duan Y, Rodríguez Del Río Á, Schmidt TSB, Chong H, et al. Discovery of antimicrobial peptides in the global microbiome with machine learning. Cell. 2024;S0092–8674(24):00522–31.
Pepe MS, Etzioni R, Feng Z, Potter JD, Thompson ML, Thornquist M, et al. Phases of biomarker development for early detection of cancer. J Natl Cancer Inst. 2001;93:1054–61.
Riely GJ, Wood DE, Ettinger DS, Aisner DL, Akerley W, Bauman JR, et al. Non-small cell lung cancer, version 4.2024, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw. 2024;22:249–74.
Li M, Shao D, Fan Z, Qin J, Xu J, Huang Q, et al. Non-invasive early detection on esophageal squamous cell carcinoma and precancerous lesions by microbial biomarkers combining epidemiological factors in China. J Gastroenterol. 2024;59:531–42.
Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, et al. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 2013;41:e1.
Syromyatnikov MY, Kokina AV, Solodskikh SA, Panevina AV, Popov ES, Popov VN. High-throughput 16S rRNA gene sequencing of butter microbiota reveals a variety of opportunistic pathogens. Foods. 2020;9:608.
Xu L, Zhang C, He D, Jiang N, Bai Y, Xin Y. Rapamycin and MCC950 modified gut microbiota in experimental autoimmune encephalomyelitis mouse by brain gut axis. Life Sci. 2020;253:117747.
Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, et al. Metagenomic biomarker discovery and explanation. Genome Biol. 2011;12:R60.
Goh CE, Bohn B, Marotz C, Molinsky R, Roy S, Paster BJ, et al. Nitrite generating and depleting capacity of the oral microbiome and cardiometabolic risk: results from ORIGINS. J Am Heart Assoc. 2022;11:e023038.
Keating KA, Cherry S. Use and interpretation of logistic regression in habitat-selection studies. J Wildl Manag. 2004;68:774–89.
Glaab E, Trezzi J-P, Greuel A, Jäger C, Hodak Z, Drzezga A, et al. Integrative analysis of blood metabolomics and PET brain neuroimaging data for Parkinson’s disease. Neurobiol Dis. 2019;124:555–62.
McKenna SJ, Amaral T, Akbar S, Jordan L, Thompson A. Immunohistochemical analysis of breast tissue microarray images using contextual classifiers. J Pathol Inform. 2013;4:S13.
Lowd D, Domingos P. Naive Bayes models for probability estimation. In: Proceedings of the 22nd international conference on Machine learning. 2005. p. 529–36.
Lundberg SM, Nair B, Vavilala MS, Horibe M, Eisses MJ, Adams T, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomed Eng. 2018;2:749–60.
Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: a highly efficient gradient boosting decision tree. In: Advances in neural information processing systems. 2017;30.
Niemantsverdriet MSA, de Hond TAP, Hoefer IE, van Solinge WW, Bellomo D, Oosterheert JJ, et al. A machine learning approach using endpoint adjudication committee labels for the identification of sepsis predictors at the emergency department. BMC Emerg Med. 2022;22:208.
Mardini MT, Bai C, Wanigatunga AA, Saldana S, Casanova R, Manini TM. Age differences in estimating physical activity by wrist accelerometry using machine learning. Sensors (Basel). 2021;21:3352.
Talebi A, Celis-Morales CA, Borumandnia N, Abbasi S, Pourhoseingholi MA, Akbari A, et al. Predicting metastasis in gastric cancer patients: machine learning-based approaches. Sci Rep. 2023;13:4163.
Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30.
Wang Q, Liu X, Jiang L, Cao Y, Zhan X, Griffin CH, et al. Interrogation of internal workings in microbial community assembly: play a game through a behavioral network? mSystems. 2019;4.
Alam J, Kim YC, Choi Y. Potential role of bacterial infection in autoimmune diseases: a new aspect of molecular mimicry. Immune Netw. 2014;14:7–13.
Sun Y, Liu Y, Li J, Tan Y, An T, Zhuo M, et al. Characterization of lung and oral microbiomes in lung cancer patients using culturomics and 16S rRNA gene sequencing. Microbiol Spectr. 2023;11:e0031423.
Dong J, Li W, Wang Q, Chen J, Zu Y, Zhou X, et al. Relationships between oral microecosystem and respiratory diseases. Front Mol Biosci. 2021;8:718222.
Zhou Y, Zeng H, Liu K, Pan H, Wang B, Zhu M, et al. Microbiota profiles in the saliva, cancerous tissues and its companion paracancerous tissues among Chinese patients with lung cancer. BMC Microbiol. 2023;23:237.
Vogtmann E, Hua X, Yu G, Purandare V, Hullings AG, Shao D, et al. The oral microbiome and lung cancer risk: an analysis of 3 prospective cohort studies. J Natl Cancer Inst. 2022;114:1501–10.
Kino S, Hsu Y-T, Shiba K, Chien Y-S, Mita C, Kawachi I, et al. A scoping review on the use of machine learning in research on social determinants of health: trends and research prospects. SSM Popul Health. 2021;15:100836.
Bzdok D. Classical statistics and statistical learning in imaging neuroscience. Front Neurosci. 2017;11:543.
Le C, Deleat-Besson R, Turkestani NA, Cevidanes L, Bianchi J, Zhang W, et al. TMJOAI: an artificial web-based intelligence tool for early diagnosis of the temporomandibular joint osteoarthritis. Clin Image Based Proced Distrib Collab Learn Artif Intell Combat COVID 19 Secur Priv Preserv Mach Learn. 2021;12969:78–87.
Zeng F, Su X, Liang X, Liao M, Zhong H, Xu J, et al. Gut microbiome features and metabolites in non-alcoholic fatty liver disease among community-dwelling middle-aged and older adults. BMC Med. 2024;22:104.
Asnicar F, Thomas AM, Passerini A, Waldron L, Segata N. Machine learning for microbiologists. Nat Rev Microbiol. 2024;22:191–205.
Liu X, Zou L, Nie C, Qin Y, Tong X, Wang J, et al. Mendelian randomization analyses reveal causal relationships between the human microbiome and longevity. Sci Rep. 2023;13:5127.
Abed J, Maalouf N, Parhi L, Chaushu S, Mandelboim O, Bachrach G. Tumor targeting by Fusobacterium nucleatum: a pilot study and future perspectives. Front Cell Infect Microbiol. 2017;7:295.
Druzhinin VG, Matskova LV, Demenkov PS, Baranova ED, Volobaev VP, Minina VI, et al. Taxonomic diversity of sputum microbiome in lung cancer patients and its relationship with chromosomal aberrations in blood lymphocytes. Sci Rep. 2020;10:9681.
Liu Y, Yuan X, Chen K, Zhou F, Yang H, Yang H, et al. Clinical significance and prognostic value of Porphyromonas gingivalis infection in lung cancer. Transl Oncol. 2021;14:100972.
Stokowa-Sołtys K, Wojtkowiak K, Jagiełło K. Fusobacterium nucleatum—friend or foe? J Inorg Biochem. 2021;224:111586.
Velsko IM, Chukkapalli SS, Rivera-Kweh MF, Zheng D, Aukhil I, Lucas AR, et al. Periodontal pathogens invade gingiva and aortic adventitia and elicit inflammasome activation in αvβ6 integrin-deficient mice. Infect Immun. 2015;83:4582–93.
Forrester SJ, Kikuchi DS, Hernandes MS, Xu Q, Griendling KK. Reactive oxygen species in metabolic and inflammatory signaling. Circ Res. 2018;122:877–902.
Fridman WH, Pagès F, Sautès-Fridman C, Galon J. The immune contexture in human tumours: impact on clinical outcome. Nat Rev Cancer. 2012;12:298–306.
Gao F, Yu B, Rao B, Sun Y, Yu J, Wang D, et al. The effect of the intratumoral microbiome on tumor occurrence, progression, prognosis and treatment. Front Immunol. 2022;13:1051987.
Zimmermann A, Knecht H, Häsler R, Zissel G, Gaede KI, Hofmann S, et al. Atopobium and Fusobacterium as novel candidates for sarcoidosis-associated microbiota. Eur Respir J. 2017;50:1600746.
Cheng J, Zhou L, Wang H. Symbiotic microbial communities in various locations of the lung cancer respiratory tract along with potential host immunological processes affected. Front Cell Infect Microbiol. 2024;14:1296295.
Salipante SJ, Hoogestraat DR, Abbott AN, SenGupta DJ, Cummings LA, Butler-Wu SM, et al. Coinfection of Fusobacterium nucleatum and Actinomyces israelii in mastoiditis diagnosed by next-generation DNA sequencing. J Clin Microbiol. 2014;52:1789–92.
Ohashi A, Yamamura T, Nakamura M, Maeda K, Sawada T, Ishikawa E, et al. Network analysis of gut microbiota including Fusobacterium and oral origin bacteria and their distribution on tumor surface, normal mucosa, and in feces in patients with colorectal cancer. Digestion. 2022;103:451–61.
Acknowledgements
The authors thank all the study staff and patients who participated in the study.
Funding
This study was supported by the China Postdoctoral Science Foundation (2023MD744129, 2023MD734101), the Natural Science Foundation project of Sichuan Science and Technology Department (2023NSFSC1815), and the Sichuan Provincial Administration of Traditional Chinese Medicine (2023ZD06).
Author information
Contributions
XZ: conceptualization, methodology, writing-original draft. QM: conceptualization, methodology, writing-original draft, investigation, writing. CXH, JJX: investigation, analysis, visualization. XF: software, investigation. YFR: analysis, visualization, funding. YLQ: analysis, visualization. HXX, ML, RYZ, YZ: data collection, investigation. PX, XZ: writing-review and editing. FMY: writing-review and editing, funding. JWH: conceptualization, writing-review and editing, funding. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
This study was approved by the Ethics Committee of the Affiliated Hospital of Chengdu University of Traditional Chinese Medicine (Ethics Approval No. 2022KL-051) and registered in the Chinese Clinical Trial Registry (Registration No. ChiCTR2200062140). Written informed consent was obtained from all participants.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zeng, X., Ma, Q., Huang, CX. et al. Diagnostic potential of salivary microbiota in persistent pulmonary nodules: identifying biomarkers and functional pathways using 16S rRNA sequencing and machine learning. J Transl Med 22, 1079 (2024). https://doi.org/10.1186/s12967-024-05802-7
DOI: https://doi.org/10.1186/s12967-024-05802-7