Review paper

A systematic evaluation of big data-driven colorectal cancer studies

By
Eslam Bani Mohammad

Department of Applied Science/Nursing, Al-Balqa Applied University,

Muayyad Ahmad

Clinical Nursing Department, School of Nursing, University of Jordan, Jordan

Abstract

Aim
To assess machine-learning models for colorectal cancer prediction, evaluate their methodological quality, compare their performance, and highlight their limitations.
Methods
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations were applied. The electronic databases ScienceDirect, MEDLINE (through PubMed), Google Scholar, EBSCO, ERIC, and CINAHL were searched for the period January 2016 to September 2023. Data were extracted using a pre-designed data extraction sheet. The main search terms were big data, risk assessment, colorectal cancer, and artificial intelligence.
Results
Fifteen studies were included. A total of 3,057,329 colorectal cancer (CRC) health records, including those of adult patients older than 18 years, were used to generate the results. The area under the curve (AUC) ranged from 0.704 to 0.976. Logistic regression, random forests, and ColonFlag were frequently employed techniques. Overall, these studies provide substantial and accurate CRC risk prediction.
Conclusion
This review provides an up-to-date summary of recent research on the use of big data in CRC prediction. The gaps in the literature that it identifies can guide future research. Missing data, a lack of external validation, and the diversity of machine learning algorithms are the current obstacles. Although the area under the curve has a sound mathematical definition, its application depends on the modelling context.
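
To make the mathematical definition mentioned above concrete, the following is a minimal sketch of one standard way to compute the area under the receiver operating characteristic curve: as the fraction of (case, non-case) pairs in which the case receives the higher predicted risk, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical illustrations and are not taken from any of the reviewed studies.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 0/1 outcomes (1 = case, 0 = non-case)
    scores: iterable of predicted risks, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    # Each (case, non-case) pair counts 1 if ranked correctly, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three cases and four non-cases.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
toy_auc = auc(labels, scores)  # 10 of 12 pairs are ranked correctly
```

Because this quantity depends only on how the scores rank cases against non-cases, it carries no information about calibration or about the decision threshold used in practice, which is one way the same AUC value can mean different things in different modelling contexts.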

Authors retain copyright. This work is licensed under a Creative Commons Attribution 4.0 International License.

 

The statements, opinions and data contained in the journal are solely those of the individual authors and contributors and not of the publisher and the editor(s). We stay neutral with regard to jurisdictional claims in published maps and institutional affiliations.