A data-mining model for predicting low birth weight with a high AUC

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Birth weight is a significant determinant of a newborn’s probability of survival. Data-mining models are receiving considerable attention for identifying low birth weight risk factors. However, predicting actual birth weight values from the identified risk factors, which can play a significant role in identifying mothers at risk of delivering low birth weight infants, remains unsolved. This paper presents a study of data-mining models that predict the actual birth weight, with particular emphasis on achieving a higher area under the receiver operating characteristic curve (AUC). The prediction is based on 2006 birth data from the North Carolina State Center for Health Statistics. The steps followed to extract meaningful patterns from the data were data selection, handling missing values, handling imbalanced data, model building, feature selection, and model evaluation. Decision trees were used to classify birth weight and were tested on both the original imbalanced dataset and a dataset balanced with the synthetic minority oversampling technique (SMOTE). The results showed that models built with SMOTE-balanced datasets produce a higher AUC than models built with imbalanced datasets. The J48 model built with balanced data outperformed REPTree and Random tree with an AUC of 90.3%, and was therefore selected as the best model. In conclusion, the feasibility of using J48 for birth weight prediction offers the possibility of reducing obstetric-related complications and thus improving overall obstetric health care.
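
The abstract names two techniques that are easy to illustrate: SMOTE-style oversampling of the minority class and AUC as the evaluation measure. The sketch below is not the authors' implementation and does not reproduce J48, REPTree, or Random tree. It is a minimal standard-library illustration, and the function names, the toy feature tuples, and the parameter `k` are all assumptions made for the example.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive is scored
    above a randomly chosen negative; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: two numeric features per minority-class record
low_bw = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
extra = smote(low_bw, n_new=6)       # six synthetic minority records
print(len(low_bw) + len(extra))      # 10

# Hypothetical classifier scores and true labels (1 = low birth weight)
print(auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))  # 0.75
```

Because every synthetic record lies on a segment between two real minority records, oversampling adds no values outside the range already observed for that class. For an unbiased estimate, AUC should be computed on records that were not synthesized.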

Original language: English
Title of host publication: Computer and Information Science
Editors: Roger Lee
Publisher: Springer Verlag
Pages: 109-121
Number of pages: 13
ISBN (Print): 9783319601694
DOIs: https://doi.org/10.1007/978-3-319-60170-0_8
Publication status: Published - 2018
Event: 16th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2017 - Wuhan, China
Duration: May 24 2017 to May 26 2017

Publication series

Name: Studies in Computational Intelligence
Volume: 719
ISSN (Print): 1860-949X

Conference

Conference: 16th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2017
Country: China
City: Wuhan
Period: 5/24/17 to 5/26/17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence


  • Cite this

    Hange, U., Selvaraj, R., Galani, M., & Letsholo, K. (2018). A data-mining model for predicting low birth weight with a high AUC. In R. Lee (Ed.), Computer and Information Science (pp. 109-121). (Studies in Computational Intelligence; Vol. 719). Springer Verlag. https://doi.org/10.1007/978-3-319-60170-0_8