
Screening colorectal cancer associated autoantigens through multi-omics analysis and diagnostic performance evaluation of corresponding autoantibodies

Abstract

Background

This study aims to screen and validate novel biomarkers and to develop a user-friendly online tool for the detection of colorectal cancer (CRC).

Methods

A multi-omics approach, comprising proteomic analysis and single-cell transcriptomic analysis, was used to discover candidate tumor-associated antigens (TAAs). The presence of tumor-associated autoantibodies (TAAbs) in serum was subsequently assessed using enzyme-linked immunosorbent assays (ELISA) in 300 CRC patients and 300 healthy controls. Ten machine learning algorithms were used to develop diagnostic models, and the optimal one was selected and integrated into an R Shiny-based GUI to enhance usability and accessibility.

Results

We identified twelve potential TAAs: HMGA1, NPM1, EIF1AX, CKS1B, HSP90AB1, ACTG1, S100A11, maspin, ANXA3, eEF2, P4HB, and HKDC1. ELISA results showed that five TAAbs including anti-CKS1B, anti-S100A11, anti-maspin, anti-ANXA3, and anti-eEF2 were potential diagnostic biomarkers during the diagnostic evaluation phase (all P < 0.05). The Random Forest model yielded an AUC of 0.82 (95% CI: 0.78–0.88) on the training set and 0.75 (95% CI: 0.68–0.82) on the test set, demonstrating the robustness of the results. Web-based implementations of CRC diagnostic tools are publicly accessible via weblink https://qzan.shinyapps.io/CRCPred/.

Conclusions

A five-biomarker panel can serve as a complement to CEA and CA19-9 in CRC detection.


Introduction

Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide and the second leading contributor to cancer-related mortality [1]. While CRC predominantly affects individuals over 50 years, a notable increase in incidence among younger populations has been reported [2]. Due to the poor prognosis associated with advanced-stage CRC, where five-year survival rates drop below 15%, early detection through regular screening programs is critical [3,4,5]. In clinical practice, sigmoidoscopy and colonoscopy are currently the standard tests for CRC diagnosis because of their high sensitivity and ability to detect visible precancerous lesions [6]. Additionally, stool-based tests can increase the efficiency of colonoscopy utilization [7]. The commercially available fecal immunochemical test shows a moderate sensitivity of 67.3% for CRC detection, while the multi-target stool DNA test demonstrates superior diagnostic performance with a sensitivity of 93.9% [8]. Although methylated SEPT9 is the only FDA-approved blood-based biomarker for CRC screening, its clinical utility is hampered by limited sensitivity, detecting only 44.7% of early-stage CRC and 11.2% of advanced adenomas [9]. Conventional clinical tumor markers, such as CEA and CA19-9, show limited sensitivity, further highlighting the need for more efficient and patient-friendly diagnostic options [10,11,12].

Tumor-associated autoantibodies (TAAb) have attracted attention as potential biomarkers for cancer diagnosis due to their stable presence in the bloodstream, even when corresponding antigen levels are low [13]. Autoantibodies can be detected earlier than the clinical onset of cancer, highlighting their value for early diagnosis [14]. Individual TAAbs have limited sensitivity and specificity, necessitating the combination of numerous TAAbs to increase diagnostic accuracy [15]. Anti‑p53 antibodies, the most extensively researched autoantibodies in CRC, may serve as biomarkers to distinguish CRC from healthy individuals or benign patients, a potential supported by a summary receiver operating characteristic curve with an AUC of 0.78 (95% CI: 0.76–0.81) [16].

However, the significance of identifying new tumor-associated antigens (TAAs) cannot be neglected, as autoantibodies, antibodies that target self-antigens, play a crucial role in modulating inflammatory responses, maintaining immune system homeostasis, and distinguishing between healthy individuals and tumor patients in certain contexts [17]. Previous studies have used various techniques to identify TAAs in CRC, including serological analysis of recombinant tumor cDNA expression libraries [18], phage cDNA libraries [19], serological proteome analysis (SERPA) [20], and protein microarrays [21].

Data mining is a valuable tool for identifying potentially useful patterns within large datasets, providing a more precise and reliable estimate of the efficiency of autoantibodies in CRC detection [14, 16]. The consensus molecular subtypes (CMS) of CRC, defined by integrating multi-omics data including genomics, epigenomics, transcriptomics, and immune-related proteomics, provide a comprehensive classification system that enables the identification of molecular markers with broad generalizability for CRC diagnosis [22]. Proteomic analyses elucidate distinct protein expression profiles, and previous studies also provide data for each CMS subtype [23]. Moreover, single-cell transcriptomics allows for a comprehensive analysis of the heterogeneity of CRC and modifications within the immune microenvironment at the single-cell level, facilitating the investigation of potential changes originating from epithelial cells [24, 25]. The integration of multi-omics approach facilitates the identification of novel TAAs, thereby providing a more comprehensive foundation for CRC diagnosis [26].

Our study aims to identify TAAs using proteomic and single-cell transcriptomic analysis, evaluate the diagnostic performance of their corresponding autoantibodies, and provide a scalable, cost-effective, and minimally invasive alternative to facilitate the detection of CRC.

Materials and methods

Participating patients and sample collection

This study included two groups: 300 CRC patients (CRC group) and 300 healthy controls (HCs; HC group). Participants were matched by age (± 5 years) and gender, and were randomly divided into a training set and a test set in a 7:3 ratio. The serum samples used in the study were obtained from the Biological Specimen Bank of Henan Key Laboratory of Tumor Epidemiology (Henan, China) and were collected from October 2020 to December 2023. All enrolled primary CRC cases were verified through pathological examination and were treatment-naïve. HCs were confirmed by reviewing their medical records to ensure they were free from malignancies or immune-related diseases. This study was approved by the Institutional Review Board of Zhengzhou University (Approval number: ZZURIB 2019-002). Written informed consent forms were obtained from all participants. All procedures were conducted in accordance with the relevant guidelines and regulations, as well as the Declaration of Helsinki. Early stages were categorized as stages 0 through II, and late stages were categorized as stages III and IV.
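The 7:3 division described above can be illustrated with a minimal sketch. It assumes a simple stratified random split within each group; the function name, the seed, and the choice to ignore the age and gender matching are illustrative assumptions, not the procedure the authors report.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test, preserving the class ratio.

    Hypothetical helper: each class is shuffled and cut separately, so
    matched-pair structure (age, gender) is NOT preserved here.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)
        train_idx.extend(members[:n_train])
        test_idx.extend(members[n_train:])
    return sorted(train_idx), sorted(test_idx)

# 300 CRC patients and 300 healthy controls, as in the study design.
labels = ["CRC"] * 300 + ["HC"] * 300
train_idx, test_idx = stratified_split(labels)
train_crc = sum(1 for i in train_idx if labels[i] == "CRC")
test_crc = sum(1 for i in test_idx if labels[i] == "CRC")
```

With 300 participants per group, this yields 210 per group for training and 90 per group for testing, which matches the set sizes reported in the Results.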

The blood samples were centrifuged at 3000 g for 5 min, and the serum was aliquoted for long-term storage in a -80 ℃ freezer.

Identification of candidate TAAs based on multi-omics

The single-cell transcriptome data in the study were obtained from the Gene Expression Omnibus (GEO) database, including GSE132465, GSE144735 [24] and GSE200997 [27]. The proteomic data of COAD were downloaded from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) database [23]. Tumor and normal epithelial cells were compared to identify abnormally highly expressed genes. Proteomic data were used to verify the overexpression of the identified genes. Participant information for the relevant studies is shown in Supplementary Table S1.

Single-cell transcriptome data analysis

The R package Seurat (v4.3.2) was used to convert the count matrix of each sample [28]. Subsequently, genes expressed in fewer than three cells were removed. Low-quality cells were eliminated based on the following criteria: cells containing fewer than 200 expressed genes, an erythrocyte ratio exceeding 10%, or mitochondrial content above 20%. Other processes followed standard procedures [28]. Batch effect correction was performed using harmony during the integration of the three data sets [29]. Cell identity annotations for individual clusters were assigned based on the expression of established marker genes and verified using CellTypist [30]. Subclusters of cells with comparable gene expression profiles were then assigned to the same cell type.
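The quality-control thresholds above can be written out as a short sketch. The study used Seurat in R; the plain-Python functions below are hypothetical stand-ins, and the per-cell metric names (`n_genes`, `pct_erythrocyte`, `pct_mito`) are assumed for illustration only.

```python
def filter_genes(counts, min_cells=3):
    """Keep genes detected (count > 0) in at least `min_cells` cells.

    `counts` maps a gene name to its list of per-cell counts.
    """
    return {
        gene: values
        for gene, values in counts.items()
        if sum(1 for c in values if c > 0) >= min_cells
    }

def passes_qc(n_genes, pct_erythrocyte, pct_mito):
    """Apply the cell-level thresholds stated in the text.

    A cell is dropped if it has fewer than 200 expressed genes, an
    erythrocyte ratio above 10%, or mitochondrial content above 20%.
    """
    return n_genes >= 200 and pct_erythrocyte <= 10.0 and pct_mito <= 20.0

# Toy data (not from the study): three genes across four cells.
toy_counts = {
    "GENE_A": [0, 2, 1, 5],   # detected in 3 cells -> kept
    "GENE_B": [0, 0, 0, 7],   # detected in 1 cell  -> removed
    "GENE_C": [1, 1, 1, 1],   # detected in 4 cells -> kept
}
kept_genes = sorted(filter_genes(toy_counts))
```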

Differential expression analysis for single cell transcriptomic data

The wilcox.test algorithm within the FindMarkers function of the Seurat package was employed to identify differentially expressed genes in epithelial cells, using thresholds of Log₂FC > 0.25 and adj.P < 0.05. Subsequently, the subset function was utilized to segregate cells based on CMS classification. The same criteria of Log₂FC > 0.25 and adj.P < 0.05 were applied to obtain up-regulated differentially expressed genes across various CMS subtypes.

Differential expression analysis for proteomic data

The log-ratio normalized proteomic data were directly downloaded from the CPTAC database. Differentially expressed proteins were screened with log2FC > 0.6 and adj.P < 0.05 using limma package for differential analysis [31].
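As a rough illustration of how the two sets of thresholds fit together, the sketch below filters up-regulated features from each data type and intersects them. The gene names and statistics are invented placeholders, and the helper `upregulated` is hypothetical; the actual analyses were run with Seurat's FindMarkers and limma.

```python
def upregulated(results, min_log2fc, max_adj_p=0.05):
    """Return names whose log2 fold change and adjusted P pass the thresholds.

    `results` maps a gene/protein name to a (log2FC, adjusted P) tuple.
    """
    return {
        name
        for name, (log2fc, adj_p) in results.items()
        if log2fc > min_log2fc and adj_p < max_adj_p
    }

# Placeholder values for illustration only (not study results).
scrna_results = {
    "GENE_A": (0.80, 0.001),
    "GENE_B": (0.30, 0.200),   # fails adjusted P
    "GENE_C": (0.10, 0.001),   # fails log2FC
    "GENE_D": (1.20, 0.010),
}
proteome_results = {
    "GENE_A": (0.90, 0.004),
    "GENE_D": (0.40, 0.010),   # fails the stricter proteomic log2FC
}

rna_up = upregulated(scrna_results, min_log2fc=0.25)
protein_up = upregulated(proteome_results, min_log2fc=0.6)
# Genes up-regulated at the transcript level and confirmed at the protein level.
confirmed = sorted(rna_up & protein_up)
```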

Additional screening strategies are used to narrow down the candidate TAAs

Driver genes were collected from the IntOGen(CRC) [32] and OncoKB [33] databases. Genes within the CTDatabase [34] are named CT-related. A fetal gene expression signature [35,36,37] is supported by relevant articles, termed as Fetal_related.

In a separate screening strategy, the proteomic criteria were set at adj.P < 0.05 and log2FC > 0.3. Genes that were down-regulated in small bowel adenocarcinoma (SBA) epithelial cells compared with controls and also up-regulated in CRC epithelial cells compared with controls were designated as SBA_related.

All the genes listed above can be found in the Supplementary Table S2.

Function analysis of differentially expressed genes or proteins

Gene ontology (GO) annotation was performed to better understand the biological functions of these differentially expressed proteins. GO over-representation analysis of the selected genes was performed by using the clusterProfiler package [38].
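Over-representation analysis of this kind is conventionally based on a hypergeometric upper-tail test. The sketch below shows that calculation for a single GO term; the function name and the toy numbers are assumptions for illustration, and multiple-testing correction across terms is omitted.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when drawing n genes from a background of N genes,
    of which K are annotated to the term.

    k: selected genes annotated to the term
    n: number of selected genes
    K: background genes annotated to the term
    N: total background genes
    """
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / total

# Toy example: 10 background genes, 4 annotated, 3 selected, 2 of them annotated.
p_value = hypergeom_upper_tail(k=2, n=3, K=4, N=10)
```

In this toy case the P value is 40/120, i.e. one third, since 36 of the 120 possible selections contain exactly two annotated genes and 4 contain three.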

Recombinant proteins and the detection of TAAbs by ELISA

Eight proteins (NPM1, EIF1AX, CKS1B, eEF2 (encoded by EEF2), P4HB, ANXA3, S100A11, HSP90AB1) were purchased from CUSABIO (Wuhan, China), and four proteins (HMGA1, ACTG1, maspin (encoded by SERPINB5), HKDC1) were purchased from Cloud-clone Corporation (Wuhan, China). The concentration, purity, and molecular weight of all proteins were confirmed using SDS-PAGE. The enzyme-linked immunosorbent assay was performed with a coating concentration of 0.125 μg/mL for HMGA1, NPM1, and S100A11 and 0.25 μg/mL for EIF1AX, HSP90AB1, ACTG1, CKS1B, maspin, ANXA3, eEF2, P4HB, and HKDC1. The ELISA procedures were described in our previous studies [39, 40]. The Specific Binding Index (SBI), which represents the degree of binding between antigen and antibody, was used to evaluate the level of autoantibodies in peripheral serum: SBI = (OD_TBD - OD_Blank) / (OD_QC - OD_Blank). OD_TBD refers to the optical density (OD) value of the sample to be determined, OD_QC represents the average OD value of the quality control (QC) samples, and OD_Blank denotes the average OD value of the blank wells.
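The SBI definition can be expressed as a small function. This is a minimal sketch: the function name and the example OD readings are made up for illustration, and plate-level details such as replicate handling are not specified in the text.

```python
from statistics import mean

def specific_binding_index(od_sample, od_qc_wells, od_blank_wells):
    """SBI = (OD_TBD - OD_Blank) / (OD_QC - OD_Blank).

    OD_QC and OD_Blank are the averages of the QC and blank wells.
    """
    blank = mean(od_blank_wells)
    qc = mean(od_qc_wells)
    if qc == blank:
        raise ValueError("QC and blank averages are equal; SBI is undefined")
    return (od_sample - blank) / (qc - blank)

# Hypothetical readings: sample OD 0.85, QC wells 1.2 and 1.0, blanks 0.1 and 0.1.
sbi = specific_binding_index(0.85, [1.2, 1.0], [0.1, 0.1])
```

Here the blank average is 0.1 and the QC average is 1.1, so the SBI is 0.75 / 1.0 = 0.75.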

Diagnostic model development

Ten machine learning algorithms were trained on the SBI values of the five TAAbs in the training set using the “tidymodels” R package. These models were Logistic Regression, Decision Tree, Elastic Net, K-Nearest Neighbours, Light Gradient Boosting Machine, Random Forest (RF), eXtreme Gradient Boosting, Support Vector Machine, Multilayer Perceptron via nnet, and a Stacking ensemble, chosen for their diverse methodologies and robust performance in identifying complex patterns within the data. Ten-fold cross-validation was performed to evaluate the predictive ability of each model. Precision-recall (PR) curves were used to evaluate the discriminative ability of the models, and decision curve analysis (DCA) was used to further confirm their clinical effectiveness. Model performance was compared using AUCs on both the training and test datasets. Statistical significance between datasets was evaluated using DeLong tests. The same statistical method was applied to assess differences in AUCs between models with equivalent sample sizes.
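Since the models are compared by AUC, a minimal sketch of that quantity may help. It uses the rank-based (Mann-Whitney) definition: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name and the scores are illustrative assumptions; confidence intervals and the DeLong test are not shown.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: P(case score > control score), ties count 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical predicted probabilities (not study data).
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.4])
```

In this toy example, 7.5 of the 9 case-control pairs are ordered correctly, giving an AUC of about 0.83.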

Statistical analysis

Data analysis and visualization were performed using SPSS Statistics 26.0 and R 4.3.2. Sample size calculation was performed using PASS software (Version 15, Confidence Intervals for One Proportion). Based on this analysis, the minimum sample size of the test set used in model development was 85 CRC patients and 58 HC participants. ROC analysis and the AUC with 95% CI were used to evaluate the diagnostic performance of the biomarkers and the model. Sensitivity and specificity were determined at the cutoff value, defined as the SBI value at the maximum Youden index among cutoffs with a specificity above 85%. If either the TAAbs model or the clinical biomarker test yielded a positive outcome, the individual was classified as positive, and the corresponding positivity rate was then calculated. The three-component chi-squared test was used to compare diagnostic performance in SPSS. Single-cell plotting was carried out using scRNAtoolVis (https://github.com/junjunlab/scRNAtoolVis) and plot1cell [41]. Results were considered statistically significant at a two-sided P-value of < 0.05.
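The cutoff rule described above can be sketched as follows. This is an illustration under stated assumptions: a sample is called positive when its SBI is at or above the cutoff, every observed value is tried as a candidate cutoff, and the function name and data are hypothetical.

```python
def youden_cutoff(case_values, control_values, min_specificity=0.85):
    """Return (cutoff, sensitivity, specificity) with the largest Youden
    index (sensitivity + specificity - 1) among cutoffs whose specificity
    exceeds `min_specificity`.
    """
    best = None
    for cutoff in sorted(set(case_values) | set(control_values)):
        sensitivity = sum(v >= cutoff for v in case_values) / len(case_values)
        specificity = sum(v < cutoff for v in control_values) / len(control_values)
        if specificity <= min_specificity:
            continue  # enforce the specificity constraint
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    if best is None:
        raise ValueError("no cutoff satisfies the specificity constraint")
    return best[1], best[2], best[3]

# Toy SBI values (not study data).
controls = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
cases = [0.35, 0.65, 0.95, 0.96, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
cutoff, sens, spec = youden_cutoff(cases, controls)
```

For these toy values, the selected cutoff is 0.95, where 8 of 10 cases are positive and 9 of 10 controls are negative.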

Results

Study design and sample characteristics

This study was conducted in two steps: discovery of candidate TAAs (Step 1) and evaluation of TAAbs (Step 2) (Fig. 1). Proteomic data from 100 normal controls and 97 tumor cases and single-cell transcriptome data from 23 normal controls and 51 tumor cases were included in the current study. More detailed information can be found in Supplementary Table S1. A primary CRC single-cell transcriptome atlas was constructed, revealing an increased percentage of epithelial cells. Candidate TAAs were selected using the screening strategies shown in Fig. 1.

Fig. 1

The flow diagram of this study. This study was conducted in two steps: discovery of candidate TAAs (Step 1) and evaluation of TAAbs (Step 2). TAAs Tumor-associated antigens, TAAbs Tumor-associated autoantibodies, CMS Consensus Molecular Subtypes, DEG Differential Expression Analysis, ELISA Enzyme-linked immunosorbent assay, SBA Small Bowel Adenocarcinoma, SBI Specific Binding Index

Subsequently, serum autoantibodies against 12 candidate TAAs were further evaluated by ELISA. The verification and validation phases were employed to assess the diagnostic performance of TAAbs. During the verification phase, diagnostic models were constructed in the training set, which included 210 CRC cases and 210 HC participants. In the validation phase, a test set comprising 90 CRC cases and 90 HC individuals was used to verify the potential diagnostic value of these TAAbs. Demographic and clinical characteristics of the study participants are presented in Table 1. The majority of patients in this study were diagnosed with stage II CRC, including 118 (56.2%) in the training set and 53 (58.9%) in the test set. In accordance with routine clinical practice, CEA and CA19-9 were considered elevated above cutoff values of 5 ng/ml and 35 ng/ml, respectively. In the training and test sets, the positive rates for CEA were 31.2% and 32.2%, while those for CA19-9 were 18.6% and 12.2%, respectively. In the modeling phase, 10 machine learning models were used for model selection and training, and hyperparameter tuning was used to improve the performance of the models.

Table 1 Baseline characteristics of participants in the training set and test set

Identification of candidate TAAs based on multi-omics

Epithelial cells were identified using the markers EPCAM and KRT19, and the subtypes of the epithelial cells were defined based on the reference map from CellTypist [42] (Fig. 2a). Odds ratios showed that stem-like cells and colonocytes were enriched in tumors compared to adjacent controls (Fig. 2b). Stem-like cells showed upregulated LGR5, which is consistent with the accepted histological model of intestinal epithelium [43]. Compared to adjacent normal cells, tumor colonocytes overexpressed chemokines (CXCL1, CXCL2, CXCL3, and CCL20), which have significant effects on inflammatory processes and immune cell recruitment (Fig. 2c). Eight hundred upregulated genes were identified in epithelial cells for all CMS subtypes combined. Furthermore, 1,226 genes overexpressed across the four CMS subtypes were identified (Fig. 2d and e). Two hundred and eighty-four up-regulated proteins were identified from proteomic data (Fig. 2f).

Fig. 2

Identification of 12 candidate TAAs based on the multi-omics approach. a UMAP plot of CRC epithelial cells, color-coded by cell subtype. b Heatmap showing the tissue preference of each cell subset as indicated by the odds ratios. c Volcano plot showing significantly upregulated or downregulated genes in each cluster, with the top five markers for each cluster highlighted. d UMAP plot of CRC epithelial cells, color-coded by CMS subtypes. e A four-set Venn diagram showing the intersection of four CMS subtypes. f Volcano plot of the differentially expressed proteins between CRC and NC groups. g Upset plot for signature genes from different studies. h GO pathway enrichment analysis of 12 candidate TAAs

As shown in Fig. 2g, 72 genes were thought to play an important role in the development of CRC. After further intersecting these with the gene sets of interest for TAA discovery (listed in Supplementary Table S2), six driver genes in OncoKB and three genes from a fetal gene-expression signature were identified. In addition, EEF2, P4HB and HKDC1 were derived from SBA-related genes screened by slightly different strategies.

Finally, twelve potential TAAs, derived from the proteins encoded by the genes HMGA1, NPM1, EIF1AX, CKS1B, HSP90AB1, ACTG1, S100A11, SERPINB5, ANXA3, EEF2, P4HB, and HKDC1, were selected for subsequent experimental validation and verification (Supplementary Table S3). These genes are enriched in pathways closely related to cancer initiation and progression (Fig. 2h).

Diagnostic performance in the verification phase and validation phase

In the verification phase, six TAAbs, including anti-CKS1B, anti-ACTG1, anti-S100A11, anti-maspin, anti-ANXA3, and anti-eEF2, showed significant differences between CRC patients and NC (all P < 0.05) (Fig. 3a). The AUC values ranged from 0.58 to 0.64. Specifically, the AUC values were as follows: anti-CKS1B (AUC = 0.62, 95% CI: 0.57–0.67, P < 0.01), anti-ACTG1 (AUC = 0.59, 95% CI: 0.54–0.65, P < 0.01), anti-S100A11 (AUC = 0.64, 95% CI: 0.58–0.69, P < 0.01), anti-maspin (AUC = 0.58, 95% CI: 0.52–0.63, P < 0.01), anti-ANXA3 (AUC = 0.62, 95% CI: 0.57–0.67, P < 0.01), and anti-eEF2 (AUC = 0.60, 95% CI: 0.54–0.65, P < 0.01) (Fig. 4a).

Fig. 3

Serum autoantibody levels against candidate TAAs. a Scatter plot of the SBI values of the 12 candidate TAAbs in the training set. b Scatter plot of the SBI values of the 6 candidate TAAbs in the test set. *: P < 0.05, **: P < 0.01, ****: P < 0.0001, ns: P > 0.05

Fig. 4

Diagnostic performance of the candidate TAAbs. a ROC curve of the 12 candidate TAAbs in the training set. b ROC curves of the 6 candidate TAAbs in the test set

During validation, the diagnostic performance of the six significant TAAbs from the training set was further assessed, with five of them demonstrating potential diagnostic value. The AUCs for the six TAAbs ranged from 0.53 to 0.69, the sensitivity ranged from 22.22% to 37.78%, and the specificity ranged from 86.67% to 91.11% (Table 2). Among them, anti-maspin exhibited the highest diagnostic potential with an AUC of 0.69 (95% CI: 0.62–0.77), a sensitivity of 35.56%, and a specificity of 86.67%. Anti-ACTG1 exhibited the lowest AUC of 0.53 (95% CI: 0.45–0.62) for CRC (P = 0.439) (Fig. 4b).

Table 2 Diagnostic performance of the 12 candidate TAAbs

The results from the validation phase were consistent with those from the verification phase except for the autoantibody against ACTG1 (Fig. 3b). Consequently, data from the five TAAbs: anti-CKS1B, anti-S100A11, anti-maspin, anti-ANXA3, and anti-eEF2 were used for diagnostic model construction. There is no strong correlation between the relevant indicators in both the training and test sets.

Diagnostic performance of the immunodiagnostic model based on machine learning

The AUCs of the ten models in the training set varied from 0.67 to 0.84, and their accuracy ranged from 61.90% to 72.62% (Fig. 5a; Table 3). Similarly, in the test set, the AUC ranged from 0.63 to 0.77 and the accuracy ranged from 61.11% to 68.33% (Fig. 5d; Table 3). The DeLong test showed no significant difference in AUCs between the training and test sets for each model (Table 3).

Fig. 5

Comparative performance evaluation of 10 machine learning models for CRC diagnosis. a ROC curve analysis demonstrating the diagnostic ability associated with different models in the training set. b PR curve analysis demonstrating the ability to distinguish associated with different models in the training set. c DCA demonstrating the net benefit associated with different models in the training set. d ROC curve analysis demonstrating the diagnostic ability associated with different models in the test set. e PR curve analysis demonstrating the ability to distinguish associated with different models in the test set. f DCA demonstrating the net benefit associated with different models in the test set. LR Logistic Regression, DT Decision Tree, Enet Elastic Net, KNN K-Nearest Neighbors, LightGBM Light Gradient Boosting Machine, RF random forest, Xgboost eXtreme Gradient Boosting, SVM Support Vector Machine, MLP Multilayer Perceptron via nnet

Table 3 Diagnostic performance of the 10 machine learning algorithms in the training and test sets

The stacking model showed the best diagnostic performance in the training set (AUC: 0.84, 95% CI: 0.80–0.87), followed by the RF model (AUC: 0.82, 95% CI: 0.78–0.86) (Fig. 5a and d). The DeLong test showed no significant difference between the two models. The PR curves suggested that the stacking model outperformed the RF model (Fig. 5b and e), but DCA indicated the clinical effectiveness of the RF model in both the training set and the test set (Fig. 5c and f). To ensure a robust comparison, we evaluated the RF and stacking models on both the training and test datasets using the DeLong test. In the test set, the AUC comparison showed P = 0.204, suggesting that the two models have comparable generalization performance on test data. Therefore, the simpler RF model was considered more suitable for clinical diagnosis.

The RF model exhibited consistent performance across training and test sets. It achieved an AUC of 0.82 (95% CI: 0.78–0.86) with 68.10% accuracy in the training set (Fig. 6a and b; Table 4), and an AUC of 0.75 (95% CI: 0.68–0.82) with 67.22% accuracy in the test set (Fig. 6c and d; Table 3).

Fig. 6

Diagnostic performance of the RF model. a ROC curve of the RF model in the training set. b Confusion matrix of the RF model in the training set. c ROC curve of the RF model in the test set. d Confusion matrix of the RF model in the test set. e SHAP feature attributions. Each row represents a feature, and the abscissa is the SHAP value. Blue dots represent higher feature values, and red dots represent lower feature values. f Bar chart of positive rate in the training set. g Bar chart of positive rate in the test set. h Bar chart of positive rate in the training set. i Bar chart of positive rate in the test set. j Bar chart of positive rate in the training set. k Bar chart of positive rate in the test set

Table 4 Subgroup performance of the Five-TAAbs model in clinical diagnostics

Interpretation and application of RF model

An online application for this RF model was developed, and SHAP was integrated to enhance the interpretability of the machine learning model, thereby increasing its utility in clinical settings. The bee swarm plot illustrates how the key characteristics in the dataset influence the model’s output, highlighting that anti-S100A11 has the highest SHAP values among all features (Fig. 6e).

To enhance its clinical utility, the RF model was deployed as a user-friendly web application accessible through a Shiny server (https://qzan.shinyapps.io/CRCPred/). Users can input TAAb values for one sample using the web interface, and the predicted likelihood of CRC diagnosis will be returned. The application leverages machine learning algorithms for real-time online computation.

Enhanced diagnostic performance of the 5-TAAbs immunodiagnostic model combined with CEA and CA19-9

Analysis of patients with CEA and CA19-9 test results demonstrated that combining the RF model with these biomarkers significantly improved the positive rate. In the training set, the RF model’s positive rate ranged from 50.3% to 50.5%, while CEA and CA19-9 achieved rates of 39.6% and 20.2%, respectively (Fig. 6f and h). Combining CEA and CA19-9 increased the positive rate to 44.3%. Notably, adding the RF model to CEA and CA19-9 further raised the positive rate to 75.0%, an improvement of 30.7 percentage points over the combination of CEA and CA19-9 alone (Fig. 6j).

Similar trends were observed in the test set. The RF model achieved a positive rate of 51.8%, while the CEA and CA19-9 rates were 34.1% and 12.9%, respectively (Fig. 6g and i). The combination of CEA and CA19-9 yielded a positive rate of 36.5%. When combined with the RF model, the positive rate increased to 65.9%, an improvement of 29.4 percentage points over the CEA and CA19-9 combination (Fig. 6k).
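The combined positive rates follow the either-positive rule stated in the Statistical analysis section. The sketch below illustrates that rule on invented flags for eight hypothetical patients and then reproduces the arithmetic behind the reported training-set gain; the helper names are assumptions.

```python
def positive_rate(flags):
    """Percentage of samples flagged positive."""
    return 100.0 * sum(flags) / len(flags)

def combine_or(*tests):
    """A sample is positive if any of the supplied tests is positive."""
    return [any(sample) for sample in zip(*tests)]

# Toy flags for eight hypothetical CRC patients (not study data).
rf_model = [True, False, True, False, True, False, False, True]
cea      = [True, True, False, False, False, False, False, False]
ca19_9   = [False, True, False, False, False, False, True, False]

markers_only = combine_or(cea, ca19_9)           # 3 of 8 positive
all_three = combine_or(rf_model, cea, ca19_9)    # 6 of 8 positive
toy_gain = positive_rate(all_three) - positive_rate(markers_only)

# Reported training-set rates: 75.0% combined vs 44.3% for CEA + CA19-9.
reported_gain = round(75.0 - 44.3, 1)
```

The gain is a difference between two rates and is therefore expressed in percentage points (30.7 for the reported training-set values), not as a relative increase.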

Subgroup analysis of the 5-TAAbs immunodiagnostic model for clinical application

Due to the limited sample size for certain clinical characteristics, the diagnostic value of the model in subgroups was primarily assessed in the training dataset. A subgroup analysis was performed on clinical features including age, gender, lesion location, and disease stage. The findings revealed that the model demonstrated significantly better diagnostic efficacy for individuals aged 50 years and above compared to those under 50 years (P = 0.017) (Table 4). CRC diagnosed in individuals under 50 years is considered early-onset, while diagnosis at age 50 and above is termed late-onset CRC. Given the well-established positive correlation between age and the incidence risk of late-onset CRC, the model’s superior diagnostic performance in individuals aged 50 and over suggests its potential for future CRC screening programs targeting this high-risk population. However, the DeLong test revealed no significant difference in model performance between early and advanced stages of the disease (P = 0.085).

Discussion

CRC poses a significant global health burden, emphasizing the critical need for early diagnosis and improved patient outcomes [1]. Multi-omics analysis provided a comprehensive view of molecular changes in CRC, facilitating the identification of antigens from various biological pathways. By integrating single-cell transcriptome data from 74 individuals and proteome data from 197 individuals, this study screened 12 candidate TAAs and identified 5 TAAbs for CRC diagnosis through a two-phase ELISA validation. While the individual AUCs of these five TAAbs ranged from 0.58 to 0.64 in the training set and 0.62 to 0.69 in the test set, their combination within the RF model significantly improved diagnostic performance. Notably, combining the 5-TAAb RF model with established biomarkers like CEA or CA19-9 demonstrated superior performance in CRC detection.

The identification of TAAs is crucial for understanding CRC pathogenesis. Mitochondrial DNA transfer contributes to cancer progression by inducing epithelial cells to produce pro-cancer cytokines [44]. Furthermore, the accumulation of somatic mutations in cancer genomes results in the presence of multiple driver gene mutations within a single tumor [45]. The driver genes cataloged in the IntOGen and OncoKB may serve as potential TAAs [46]. Notably, stage-specific embryonic antigens, often present on both pluripotent stem cells and cancer stem cells, are considered potential diagnostic markers and therapeutic targets [47, 48]. By selecting genes commonly upregulated across different subtypes of CRC, we aimed to ensure broader applicability of the identified candidate genes.

In the training set, the AUC for these five TAAbs ranged from 0.58 to 0.64, while in the test set, the AUC ranged from 0.62 to 0.69. Notably, the diagnostic performance of these TAAbs in CRC has not been confirmed in previous studies. Yang et al. employed the SERPA method to identify anti-maspin and anti-ANXA3 as potential biomarkers for colon cancer, demonstrating differential expression patterns in a limited cohort of eight patients with colon adenocarcinoma and liver metastasis at various stages [49]. However, these findings lack validation in larger, population-based studies. Moreover, maspin expression is known to be downregulated during the early stages of tumorigenesis, and CEA may potentially influence its expression levels in CRC [50]. Yusuke et al. found that CRC patients exhibit significantly higher levels of eEF2 IgG antibodies compared to healthy individuals (P < 0.01) [51]. Studies have shown that eEF2 protein levels are significantly elevated in many different cancer types compared to normal tissues [51]. Research indicates that S100A11 is overexpressed in CRC and undergoes nucleocytoplasmic translocation during cancer development, potentially impacting cancer cell proliferation [52]. Moreover, another study shows that serum S100A11 levels are significantly elevated in CRC patients [53]. CKS1B promotes cell proliferation by regulating the activity of cyclins and cyclin-dependent kinases, playing a key role in controlling the transition from the G1 to S phase of the cell cycle [54].

IgG isotype autoantibodies are abundant and diverse in human serum, and their levels can be influenced by disease conditions, including cancer [55]. An integrative analysis has demonstrated a significant increase in IgG-secreting plasma cells within CRC tissues compared to normal and adjacent tissues [56]. The presence of specific IgG antibodies in plasma offers a valuable resource for clinical detection [55]. Liu et al., after reviewing multiple studies, found that p53 autoantibody levels are significantly elevated in the blood of CRC patients compared to healthy controls. Furthermore, patients with negative serum p53 antibody detection exhibited longer disease-free and overall survival, highlighting its potential as a biomarker for early detection and clinical prognosis [14]. Previous studies showed that, although 69 CRC-associated TAAbs demonstrated high specificity (> 85%), their individual sensitivity was generally low (< 30%) [57, 58]. However, combining multiple autoantibodies significantly enhances sensitivity without compromising specificity, with combined sensitivity increasing from 18.1%–35.1% to 58.5% [57, 59]. In our study, the sensitivity of individual markers ranged from 20.48% to 36.67% in the training set and from 22.22% to 37.78% in the test set, with specificity exceeding 85% in both sets. Training an RF model using five TAAbs resulted in a sensitivity of 50.95% in the training set and 51.11% in the test set. Notably, combining this RF model with CEA and CA19-9 significantly enhanced diagnostic accuracy for CRC, emphasizing the complementary role of these TAAbs alongside traditional clinical biomarkers.

Machine learning algorithms offer significant advantages in accuracy and efficiency when applied to oncology [16, 60]. Yin et al. developed an extracellular vesicle–related RF model for CRC diagnosis, achieving an AUC of 0.960 [61]. In this study, models from the decision tree ensemble family, such as RF, LightGBM, and XGBoost, demonstrated superior performance. By comparison, LR, which served as a baseline model, performed notably worse than the tree-based models. The RF and Stacking models were the top two performers in terms of diagnostic accuracy and showed considerable generalization capability on the test set (P = 0.204). We therefore selected the simpler RF model, whose predictions can be interpreted through SHAP.

This study presents several key strengths. Firstly, the integration of multi-omics data and CMS classification enhances the robustness of TAA selection. Secondly, rigorous validation of candidate TAAs ensures the reliability of the identified biomarkers. Thirdly, the application of machine learning algorithms improves diagnostic accuracy. Furthermore, the optimal RF model has been deployed as a Shiny app to enhance practical usability.

However, certain limitations need to be acknowledged. The sample size, while adequate for initial analysis, may limit the power of subgroup analyses. Additionally, the study population was predominantly from central China, which may limit the generalizability of the findings. Furthermore, the impact of potential confounding factors, such as smoking and alcohol consumption, was not assessed. Future large-scale, multi-center studies are warranted to further validate the diagnostic performance of the developed model and its generalizability across diverse populations.

Conclusions

In conclusion, this study developed and validated a novel diagnostic method for CRC that combines a panel of TAAbs with a robust machine-learning model. This user-friendly model is accessible through a web application (https://qzan.shinyapps.io/CRCPred/), offering a promising tool for CRC diagnosis. The optimal model proposed in our study can significantly enhance the diagnostic performance of CEA and CA19-9.

Data availability

The datasets supporting the conclusions of this article are included within the article. All data utilized in this study are accessible from the corresponding authors upon reasonable request.

Abbreviations

CRC: Colorectal Cancer

TAAs: Tumor-associated Antigens

TAAbs: Tumor-associated Autoantibodies

ELISA: Enzyme-linked Immunosorbent Assays

SERPA: Serological Proteome Analysis

CMS: Consensus Molecular Subtypes

HC: Healthy Control

GEO: Gene Expression Omnibus

CPTAC: Clinical Proteomic Tumor Analysis Consortium

SBA: Small Intestine Carcinoma

GO: Gene Ontology

SBI: Specific Binding Index

OD: Optical Density

QC: Quality Control

RF: Random Forest

PR: Precision-Recall

DCA: Decision Curve Analysis

References

  1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. Ca Cancer J Clin. 2021;71:209–49.

  2. Spaander MCW, Zauber AG, Syngal S, Blaser MJ, Sung JJ, You YN, et al. Young-onset colorectal cancer. Nat Reviews Disease Primers. 2023;9:21.

  3. Morgan E, Arnold M, Gini A, Lorenzoni V, Cabasag CJ, Laversanne M, et al. Global burden of colorectal cancer in 2020 and 2040: incidence and mortality estimates from GLOBOCAN. Gut. 2023;72:338–44.

  4. Hamdy NM. Unraveling the NcRNA landscape that governs colorectal cancer: A roadmap to personalized therapeutics. Life Sci. 2024;354:122946.

  5. Rizk NI. Revealing the role of serum Exosomal novel long non-coding RNA NAMPT-AS as a promising diagnostic/prognostic biomarker in colorectal cancer patients. Life Sci. 2024;352:122850.

  6. Zaika V, Prakash MK, Cheng C-Y, Schlander M, Lang BM, Beerenwinkel N, et al. Optimal timing of a colonoscopy screening schedule depends on adenoma detection, adenoma risk, adherence to screening and the screening objective: A microsimulation study. PLoS ONE. 2024;19:e0304374.

  7. Coronado GD, Bienen L, Burnett-Hartman A, Lee JK, Rutter CM. Maximizing scarce colonoscopy resources: the crucial role of stool-based tests. JNCI: J Natl Cancer Inst. 2024;116:647–52.

  8. Imperiale TF, Gagrat ZD, Garces J, Brinberg D, Limburg PJ. Next-Generation multitarget stool DNA test for colorectal cancer screening. N Engl J Med. 2024;390:984–93.

  9. Church TR, Wandell M, Lofton-Day C, Mongin SJ, Burger M, Payne SR, et al. Prospective evaluation of methylated SEPT9 in plasma for detection of asymptomatic colorectal cancer. Gut. 2014;63:317–25.

  10. Beniwal SS, Lamo P, Kaushik A, Lorenzo-Villegas DL, Liu Y, MohanaSundaram A. Current status and emerging trends in colorectal cancer screening and diagnostics. Biosensors. 2023;13:926.

  11. Stiksma J, Grootendorst DC, Van Der Linden PWG. CA 19-9 as a marker in addition to CEA to monitor colorectal cancer. Clin Colorectal Cancer. 2014;13:239–44.

  12. Luo H, Shen K, Li B, Li R, Wang Z, Xie Z. Clinical significance and diagnostic value of serum NSE, CEA, CA19–9, CA125 and CA242 levels in colorectal cancer. Oncol Lett. 2020;20:742–50.

  13. Liu W, Peng B, Lu Y, Xu W, Qian W, Zhang J-Y. Autoantibodies to tumor-associated antigens as biomarkers in cancer immunodiagnosis. Autoimmun Rev. 2011;10:331–5.

  14. Liu S, Tan Q, Song Y, Shi Y, Han X. Anti-p53 autoantibody in blood as a diagnostic biomarker for colorectal cancer: A meta-analysis. Scand J Immunol. 2020;91:e12829.

  15. Chen H, Werner S, Tao S, Zörnig I, Brenner H. Blood autoantibodies against tumor-associated antigens as biomarkers in early detection of colorectal cancer. Cancer Lett. 2014;346:178–87.

  16. Wang H, Li X, Zhou D, Huang J. Autoantibodies as biomarkers for colorectal cancer: A systematic review, meta-analysis, and bioinformatics analysis. Int J Biol Markers. 2019;34:334–47.

  17. Poletaev A, Pukhalenko A, Kukushkin A, Sviridov P. Detection of early cancer: genetics or immunology? Serum autoantibody profiles as markers of malignancy. ACAMC. 2015;15:1260–3.

  18. Ran Y, Hu H, Zhou Z, Yu L, Sun L, Pan J, et al. Profiling Tumor-Associated autoantibodies for the detection of colon cancer. Clin Cancer Res. 2008;14:2696–700.

  19. Chang W, Wu L, Cao F, Liu Y, Ma L, Wang M, et al. Development of autoantibody signatures as biomarkers for early detection of colorectal carcinoma. Clin Cancer Res. 2011;17:5715–24.

  20. Wang H, Zhang B, Li X, Zhou D, Li Y, Jia S et al. Identification and validation of novel serum autoantibody biomarkers for early detection of colorectal cancer and advanced adenoma. Front Oncol. 2020;10.

  21. Barpanda A, Tuckley C, Ray A, Banerjee A, Duttagupta SP, Kantharia C, et al. A protein microarray-based serum proteomic investigation reveals distinct autoantibody signature in colorectal cancer. Proteom Clin Apps. 2023;17:2200062.

  22. Guinney J, Dienstmann R, Wang X, de Reyniès A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med. 2015;21:1350–6.

  23. Vasaikar S, Huang C, Wang X, Petyuk VA, Savage SR, Wen B, et al. Proteogenomic analysis of human colon cancer reveals new therapeutic opportunities. Cell. 2019;177:1035–1049.e19.

  24. Lee H-O, Hong Y, Etlioglu HE, Cho YB, Pomella V, Van den Bosch B, et al. Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer. Nat Genet. 2020;52:594–603.

  25. Khaliq AM, Erdogan C, Kurt Z, Turgut SS, Grunvald MW, Rand T, et al. Refining colorectal cancer classification and clinical stratification through a single-cell atlas. Genome Biol. 2022;23:113.

  26. Alorda-Clara M, Torrens-Mas M, Morla-Barcelo PM, Martinez-Bernabe T, Sastre-Serra J, Roca P, et al. Use of omics technologies for the detection of colorectal cancer biomarkers. Cancers. 2022;14:817.

  27. Joanito I, Wirapati P, Zhao N, Nawaz Z, Yeo G, Lee F, et al. Single-cell and bulk transcriptome sequencing identifies two epithelial tumor cell States and refines the consensus molecular classification of colorectal cancer. Nat Genet. 2022;54:963–75.

  28. Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29.

  29. Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, et al. Fast, sensitive and accurate integration of single-cell data with harmony. Nat Methods. 2019;16:1289–96.

  30. Domínguez Conde C, Xu C, Jarvis LB, Rainbow DB, Wells SB, Gomes T, et al. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science. 2022;376:eabl5197.

  31. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47–47.

  32. Gonzalez-Perez A, Perez-Llamas C, Deu-Pons J, Tamborero D, Schroeder MP, Jene-Sanz A, et al. IntOGen-mutations identifies cancer drivers across tumor types. Nat Methods. 2013;10:1081–2.

  33. Chakravarty D, Gao J, Phillips S, Kundra R, Zhang H, Wang J et al. OncoKB: A precision oncology knowledge base. JCO Precision Oncol. 2017;1:PO.17.00011.

  34. Almeida LG, Sakabe NJ, deOliveira AR, Silva MCC, Mundstein AS, Cohen T, et al. CTdatabase: a knowledge-base of high-throughput and curated data on cancer-testis antigens. Nucleic Acids Res. 2009;37(Database issue):D816–9.

  35. Chen B, Scurrah CR, McKinley ET, Simmons AJ, Ramirez-Solano MA, Zhu X, et al. Differential pre-malignant programs and microenvironment chart distinct paths to malignancy in human colorectal polyps. Cell. 2021;184:6262–6280.e26.

  36. Park Y-K, Franklin JL, Settle SH, Levy SE, Chung E, Jeyakumar LH, et al. Gene expression profile analysis of mouse colon embryonic development. Genesis. 2005;41:1–12.

  37. Mustata RC, Vasile G, Fernandez-Vallone V, Strollo S, Lefort A, Libert F, et al. Identification of Lgr5-independent spheroid-generating progenitors of the mouse fetal intestinal epithelium. Cell Rep. 2013;5:421–32.

  38. Yu G, Wang L-G, Han Y, He Q-Y. ClusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16:284–7.

  39. Sun G, Ye H, Yang Q, Zhu J, Qiu C, Shi J, et al. Using proteome microarray and gene expression omnibus database to screen Tumour-Associated antigens to construct the optimal diagnostic model of oesophageal squamous cell carcinoma. Clin Oncol. 2023;35:e582–92.

  40. Li T, Sun G, Ye H, Song C, Shen Y, Cheng Y, et al. ESCCPred: a machine learning model for diagnostic prediction of early esophageal squamous cell carcinoma using autoantibody profiles. Br J Cancer. 2024;131:883–94.

  41. Wu H, Gonzalez Villalobos R, Yao X, Reilly D, Chen T, Rankin M, et al. Mapping the single-cell transcriptomic response of murine diabetic kidney disease to therapies. Cell Metab. 2022;34:1064–1078.e6.

  42. Xu C, Prete M, Webb S, Jardine L, Stewart B, He P et al. Automatic cell type harmonization and integration across Human Cell Atlas datasets.

  43. Malagola E, Vasciaveo A, Ochiai Y, Kim W, Zheng B, Zanella L, et al. Isthmus progenitor cells contribute to homeostatic cellular turnover and support regeneration following intestinal injury. Cell. 2024;187:3056–3071.e17.

  44. Guan B, Liu Y, Xie B, Zhao S, Yalikun A, Chen W, et al. Mitochondrial genome transfer drives metabolic reprogramming in adjacent colonic epithelial cells promoting TGFβ1-mediated tumor progression. Nat Commun. 2024;15:3653.

  45. Gires O, Pan M, Schinke H, Canis M, Baeuerle PA. Expression and function of epithelial cell adhesion molecule EpCAM: where are we after 40 years? Cancer Metastasis Rev. 2020;39:969–87.

  46. Wang T, Liu H, Pei L, Wang K, Song C, Wang P, et al. Screening of tumor-associated antigens based on Oncomine database and evaluation of diagnostic value of autoantibodies in lung cancer. Clin Immunol. 2020;210:108262.

  47. Andrews PW, Gokhale PJ. A short history of pluripotent stem cells markers. Stem Cell Rep. 2024;19:1–10.

  48. Long Y-Y, Wang Y, Huang Q-R, Zheng G-S, Jiao S-C. Measurement of serum antibodies against NY-ESO-1 by ELISA: A guide for the treatment of specific immunotherapy for patients with advanced colorectal cancer. Exp Ther Med. 2014;8:1279–84.

  49. Yang Q, Roehrl MH, Wang JY. Proteomic profiling of antibody-inducing immunogens in tumor tissue identifies PSMA1, LAP3, ANXA3, and Maspin as colon cancer markers. Oncotarget. 2018;9:3996–4019.

  50. Baek JY, Yeo HY, Chang HJ, Kim K, Kim SY, Park JW, et al. Serpin B5 is a CEA-interacting biomarker for colorectal cancer. Int J Cancer. 2014;134:1595–604.

  51. Oji Y, Tatsumi N, Fukuda M, Nakatsuka S-I, Aoyagi S, Hirata E, et al. The translation elongation factor eEF2 is a novel tumor-associated antigen overexpressed in various types of cancers. Int J Oncol. 2014;44:1461–9.

  52. Cross SS, Hamdy FC, Deloulme JC, Rehman I. Expression of S100 proteins in normal human tissues and common cancers using tissue microarrays: S100A6, S100A8, S100A9 and S100A11 are all overexpressed in common cancers. Histopathology. 2005;46:256–69.

  53. Moravkova P, Kohoutova D, Vavrova J, Bures J. Serum S100A6, S100A8, S100A9 and S100A11 proteins in colorectal neoplasia: results of a single centre prospective study. Scand J Clin Lab Invest. 2020;80:173–8.

  54. Kashkin KN, Chernov IP, Stukacheva EA, Kopantzev EP, Monastyrskaya GS, Uspenskaya NY, et al. Cancer specificity of promoters of the genes involved in cell proliferation control. Acta Naturae. 2013;5:79–83.

  55. Nagele EP, Han M, Acharya NK, DeMarshall C, Kosciuk MC, Nagele RG. Natural IgG autoantibodies are abundant and ubiquitous in human Sera, and their number is influenced by age, gender, and disease. PLoS ONE. 2013;8:e60726.

  56. Chu X, Li X, Zhang Y, Dang G, Miao Y, Xu W, et al. Integrative single-cell analysis of human colorectal cancer reveals patient stratification with distinct immune evasion mechanisms. Nat Cancer. 2024;5:1409–26.

  57. Niloofa R, De Zoysa MI, Seneviratne SL. Autoantibodies in the diagnosis, prognosis, and prediction of colorectal cancer. J Cancer Res Ther. 2021;17:819–33.

  58. Nikolaou S, Qiu S, Fiorentino F, Rasheed S, Tekkis P, Kontovounisios C. Systematic review of blood diagnostic markers in colorectal cancer. Tech Coloproctol. 2018;22:481–98.

  59. Chan C, Fan C, Kuo Y, Chen Y, Chang P, Chen K, et al. Multiple serological biomarkers for colorectal cancer detection. Intl J Cancer. 2010;126:1683–90.

  60. Singh G. Artificial intelligence in colorectal cancer: a review. Siberian J Oncol. 2023;22:99–107.

  61. Yin H, Xie J, Xing S, Lu X, Yu Y, Ren Y, et al. Machine learning-based analysis identifies and validates serum Exosomal proteomic signatures for the diagnosis of colorectal cancer. Cell Rep Med. 2024;5:101689.

Acknowledgements

The authors would like to thank all the staff and patients who participated in our study from the First Affiliated Hospital of Zhengzhou University and the Biological Specimen Bank of Henan Key Laboratory of Tumor Epidemiology (Henan, China) for their cooperation and collaboration. Data analysis was supported by the Supercomputing Center in Zhengzhou University (Zhengzhou).

Funding

This work was funded by the Project of Basic Research Fund of Henan Institute of Medical and Pharmaceutical Sciences (2022BP0112, 2024BP0207). This study was also supported by the International Cultivation of Henan Advanced Talents Program.

Author information

Contributions

Conceptualization, JS, ZQ, LD and KW; Project administration & Supervision, JS, PW and HY; Formal analysis, ZQ, YC, HL and TL; Methodology & Experiment, ZQ, HL, YL, DJ and XZ; Investigation, XW, ZK and LP; Data curation & Data analysis, ZQ, JS and YJ; Original draft, ZQ; Review & Editing, All authors.

Corresponding author

Correspondence to Jianxiang Shi.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Institutional Review Board of Zhengzhou University (Approval number: ZZURIB 2019-002). All participants were informed of the purpose of the study and provided written consent by signing consent forms.

Consent for publication

All authors have read the manuscript and have agreed to submit it in its current form for consideration for publication.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Qiu, Z., Cheng, Y., Liu, H. et al. Screening colorectal cancer associated autoantigens through multi-omics analysis and diagnostic performance evaluation of corresponding autoantibodies. BMC Cancer 25, 713 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12885-025-14080-5

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12885-025-14080-5

Keywords