Review paper

A systematic evaluation of big data-driven colorectal cancer studies

By
Eslam Bani Mohammad

Department of Applied Science/Nursing, Al-Balqa Applied University

Muayyad Ahmad

Clinical Nursing Department, School of Nursing, University of Jordan, Jordan

Abstract

Aim
To assess machine-learning models for colorectal cancer prediction and their methodological quality, compare their performance, and highlight their limitations.
Methods
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations were applied. The electronic databases ScienceDirect, MEDLINE (through PubMed), Google Scholar, EBSCO, ERIC, and CINAHL were searched for the period from January 2016 to September 2023. The review data were extracted using a pre-designed data extraction sheet. The main search terms were big data, risk assessment, colorectal cancer, and artificial intelligence.
Results
Fifteen studies were included. A total of 3,057,329 colorectal cancer (CRC) health records, including those of adult patients older than 18, were used to generate the results. The area under the curve (AUC) ranged from 0.704 to 0.976. Logistic regression, random forests, and ColonFlag were the most frequently employed techniques. Overall, these studies provided considerably accurate CRC risk prediction.
Conclusion
This review gives an up-to-date summary of recent research on the use of big data in CRC prediction. By identifying gaps in the literature, it can facilitate future research. The current obstacles are missing data, a lack of external validation, and the diversity of machine learning algorithms. Although the area under the curve has a sound mathematical definition, its application depends on the modelling context.
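The area under the curve reported above has a simple rank-based reading: it is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The following is a minimal Python sketch of that definition. The function name `auc` and the labels and scores are hypothetical illustrations, not data or code from any reviewed study.

```python
def auc(labels, scores):
    """Rank-based AUC: share of positive/negative pairs in which the
    positive case has the higher score (tied pairs count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = CRC diagnosed, 0 = no CRC; scores are made up.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(auc(labels, scores))
```

When a predictor takes only a few distinct values, many positive-negative pairs tie, so the tie-handling convention (0.5 credit in this sketch) affects the result. This is one way the same AUC figure can depend on the modelling context.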

Authors retain copyright. This work is licensed under a Creative Commons Attribution 4.0 International License.