Hepatitis C Detection from Blood Donor Data Using Hybrid Deep Feature Synthesis and Interpretable Machine Learning

dc.contributor.author: Safiul Haque Chowdhury
dc.contributor.author: Md Shafiul Alam Chowdhury
dc.contributor.author: Mohammed Ibrahim Hussain
dc.contributor.author: Mohammed Sowket Ali
dc.contributor.author: Muhammad Minoar Hossain
dc.contributor.author: Mohammad Mamun
dc.date.accessioned: 2026-04-29T06:16:50Z
dc.date.issued: 2025-09-29
dc.description.abstract: Hepatitis is a severe liver inflammation that can lead to chronic disease, liver failure, and death if untreated. It is caused by viral infections, autoimmune disorders, or excessive alcohol consumption, with viral hepatitis (e.g., Hepatitis B and Hepatitis C) being particularly fatal. Early diagnosis and accurate prognosis are crucial for effective treatment. To address this, we employ advanced computational techniques for hepatitis classification and risk assessment, collecting a dataset of 615 samples with 12 biochemical features (e.g., Albumin, Bilirubin, and Cholesterol) obtained from blood donors. We apply the Synthetic Minority Over-sampling Technique (SMOTE) to handle class imbalance and implement Deep Feature Synthesis (DFS) with Aggregation Primitives (AP), Transformation Primitives (TP), and a Hybrid DFS approach to generate three feature-enhanced datasets. Multiple machine learning (ML) models, including Extreme Gradient Boosting (XGB), Random Forest (RF), Gradient Boosting Decision Trees (GBDT), Categorical Boosting (CB), and Adaptive Boosting (AB), are trained with and without DFS. Performance is evaluated using accuracy, precision, recall, F1-score, and specificity, from Confusion Matrix (CM) analysis via 10-fold cross-validation. The GBDT model with Hybrid DFS achieves the highest accuracy of 99.49%, along with 99.56% precision, 99.81% recall, 99.69% F1-score, and 99.12% specificity. To enhance interpretability, we apply Explainable Artificial Intelligence (XAI) techniques, namely Shapley Additive Explanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), to analyze feature importance and model behaviour. The proposed Hybrid DFS-GBDT approach demonstrates high effectiveness and interpretability, offering a robust framework for hepatitis diagnosis and prognosis.
dc.identifier.citation: Chowdhury, Safiul Haque, et al. "Hepatitis C detection from blood donor data using hybrid deep feature synthesis and interpretable machine learning." 2025 2nd International Conference on Next-Generation Computing, IoT and Machine Learning (NCIM). IEEE, 2025.
dc.identifier.issn: 979-8-3315-5543-6
dc.identifier.uri: http://dspace.uttarauniversity.edu.bd:4000/handle/123456789/1423
dc.language.iso: en_US
dc.publisher: 2025 2nd International Conference on Next-Generation Computing, IoT and Machine Learning, NCIM 2025
dc.subject: Hepatitis C Detection
dc.subject: Blood Donor Data Analysis
dc.subject: Hybrid Deep Feature Synthesis
dc.subject: Medical Data Classification
dc.title: Hepatitis C Detection from Blood Donor Data Using Hybrid Deep Feature Synthesis and Interpretable Machine Learning
dc.type: Article
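
The abstract reports accuracy, precision, recall, F1-score, and specificity derived from a confusion matrix. As a minimal sketch of how those five metrics follow from the four confusion-matrix counts, the Python below uses the standard definitions. The names `ConfusionCounts` and `metrics_from_confusion` and the example counts are hypothetical illustrations; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def metrics_from_confusion(c: ConfusionCounts) -> dict:
    """Compute the five metrics named in the abstract from raw counts.

    Assumes every denominator is non-zero, i.e. the data contains both
    classes and the model predicts at least one positive.
    """
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)  # also called sensitivity
    return {
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": c.tn / (c.tn + c.fp),
    }


if __name__ == "__main__":
    # Made-up counts chosen only to exercise the formulas.
    example = ConfusionCounts(tp=90, fp=10, tn=80, fn=20)
    for name, value in metrics_from_confusion(example).items():
        print(f"{name}: {value:.4f}")
```

Under k-fold cross-validation, metrics like these are typically computed per fold and then averaged, or computed once from the counts summed over all folds. The record does not state which convention the authors used.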

Files

Original bundle

Name: Hepatitis C Detection from Blood Donor Data Using Hybrid Deep Feature Synthesis and Interpretable Machine Learning 276.pdf
Size: 98.63 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission