Research on prognostic models and biomarkers of lung adenocarcinoma using integrated proteomics and transcriptomics

  • Abstract:
    Objective To screen prognostic biomarkers for lung adenocarcinoma and construct a prognostic model by integrating proteomics and transcriptomics.
    Methods Proteomic, transcriptomic, and clinical characteristic data of lung adenocarcinoma patients were downloaded from the TCGA public database. The dataset was split into a training set and a validation set at a ratio of 7:3. Univariate prognostic analysis of protein expression was performed in the training set using patients' survival time, survival status, and protein expression data. A prognostic model for lung adenocarcinoma patients was constructed with the Lasso-step Cox method, and a risk score was calculated for each patient. Patients were divided into high-risk and low-risk groups at the median risk score, and the prognosis of the two groups was compared. A prognostic nomogram and calibration curves were constructed, and the model was validated across clinical subgroups and examined by correlation analysis. Expression of the model proteins was analyzed using the HPA database, and enrichment analysis was performed on the risk proteins. Immunohistochemical and clinical characteristic analyses were conducted in 20 newly diagnosed lung adenocarcinoma patients from our hospital.
    Results Five prognosis-associated proteins were identified and used to construct a risk protein model. The risk score was predictive of prognosis in lung adenocarcinoma patients, and the risk model showed strong and independent prognostic predictive ability. The nomogram showed high accuracy in predicting individual prognosis. Furthermore, the risk model and its calculated risk scores were intrinsically associated with clinical staging characteristics. HPA database analysis showed that CD38, CD49B, ADAR1, and cdc25C4 were significantly overexpressed in lung adenocarcinoma tissues. The 20 clinical specimens from our hospital confirmed that CD49B was highly expressed in newly diagnosed lung adenocarcinoma patients with distant metastasis, and that these patients were relatively sensitive to treatment.
    Conclusion Combined proteomic and transcriptomic analysis yields relatively reliable prognostic biomarkers for lung adenocarcinoma. CD49B plays an important role in lung adenocarcinoma, and the prognostic prediction model based on this gene is expected to provide an important reference for the clinical treatment of lung adenocarcinoma.
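
    The data-handling steps named in the Methods (the 7:3 split, the risk score, and the median-based grouping) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the Lasso-step Cox fitting itself is not shown, the risk score is assumed to be the usual linear combination of Cox coefficients and protein expression values, patients scoring strictly above the median are assumed to form the high-risk group, and the function names, protein names, coefficients, and expression values are all hypothetical placeholders.

```python
import random
from statistics import median


def split_train_validation(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs reproducibly and split them into training and validation sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def risk_score(expression, coefficients):
    """Assumed form of the risk score: sum of coefficient * expression over the model proteins."""
    return sum(coefficients[name] * expression[name] for name in coefficients)


def stratify_by_median(scores):
    """Label a patient 'high' if the score is strictly above the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Hypothetical coefficients; in practice these would come from the fitted Cox model.
    coefficients = {"protein_a": 0.5, "protein_b": -0.25}
    # Hypothetical expression values for four toy patients.
    cohort = {
        "p1": {"protein_a": 2.0, "protein_b": 4.0},
        "p2": {"protein_a": 4.0, "protein_b": 0.0},
        "p3": {"protein_a": 0.0, "protein_b": 4.0},
        "p4": {"protein_a": 2.0, "protein_b": 0.0},
    }
    train_ids, validation_ids = split_train_validation(cohort, seed=1)
    scores = {pid: risk_score(expr, coefficients) for pid, expr in cohort.items()}
    print(train_ids, validation_ids)
    print(stratify_by_median(scores))
```

    In a real analysis the median cutoff would normally be derived from the training set and then applied unchanged to the validation set, so that the validation result is not tuned on validation data.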

     
