News Modeling and Retrieving Information: Data-Driven Approach

dc.contributor.author: Hossain, E.
dc.contributor.author: Alshahrani, A.
dc.contributor.author: Rahman, W.
dc.date.accessioned: 2025-05-05T06:28:28Z
dc.date.issued: 2024-02-05
dc.description.abstract: This paper develops Machine Learning algorithms to classify electronic news articles related to COVID-19 through information retrieval and topic modelling. The methodology is organized into three phases: the Text Classification Approach (TCA), the Proposed Algorithms Interpretation (PAI), and the Information Retrieval Approach (IRA). The TCA covers the text preprocessing pipeline, referred to as the clean corpus. Four feature extraction methods are examined: the pre-trained Global Vectors for Word Representation (GloVe) model, FastText, Term Frequency-Inverse Document Frequency (TF-IDF), and Bag-of-Words (BOW). The PAI presents the Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) models used to classify COVID-19 news. The IRA gives the mathematical interpretation of Latent Dirichlet Allocation (LDA), which is used for topic modelling in Information Retrieval (IR). An accuracy of 99% was obtained by performing K-fold cross-validation on Bi-LSTM with GloVe. A comparative analysis of Deep Learning and Machine Learning, based on feature extraction and computational complexity, is also performed. Furthermore, some text analyses are presented and the most influential aspects of each document are explored. Bidirectional Encoder Representations from Transformers (BERT) was also used as a Deep Learning mechanism in model training, but its results were not satisfactory. The proposed system can nevertheless be adapted for real-time classification of COVID-19 news.
dc.identifier.citation: Hossain, E., Alshahrani, A., & Rahman, W. (2023). News Modeling and Retrieving Information: Data-Driven Approach. Intelligent Automation & Soft Computing, 38(2).
dc.identifier.issn: 10798587
dc.identifier.uri: http://dspace.uttarauniversity.edu.bd:4000/handle/123456789/674
dc.language.iso: en
dc.publisher: Tech Science Press
dc.subject: COVID-19
dc.subject: news retrieving
dc.subject: data-driven
dc.subject: machine learning
dc.subject: BERT
dc.subject: topic modelling
dc.title: News Modeling and Retrieving Information: Data-Driven Approach
dc.type: Article

Files

Original bundle

Name: TSP_IASC_29511.pdf
Size: 1.3 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission
