Predicting sleep quality using random forest on sleep health and lifestyle data

Naia Az - Zahra, Deri Latika Herda

Abstract


Sleep is a physiological process that is essential to maintaining the balance of biological and psychological functions. Lifestyle factors such as high stress levels and a lack of physical activity can impair a person’s sleep quality. This study analyzes the influence of health and lifestyle factors on sleep quality and develops a predictive model for sleep quality using the Random Forest algorithm. It uses the Sleep Health and Lifestyle dataset and treats the task as a classification into two categories, Ideal Sleep and Non-Ideal Sleep, with labels determined from sleep duration according to the concept of a U-shaped relationship and the sleep duration recommendations of the National Sleep Foundation. The data were preprocessed, class imbalance was handled with the SMOTE method, and the data were then split into training and testing sets. The Random Forest model was built with hyperparameter tuning and evaluated using accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). The model achieved an accuracy of 91.26%, a precision of 91.78%, a recall of 91.26%, and an F1-score of 91.30%, together with an AUC of 0.962, indicating very good discriminative ability. According to the feature importance analysis, the features with the greatest influence on sleep quality are Heart Rate, Stress Level, Physical Activity, and Daily Steps. These findings indicate that the combination of SMOTE and Random Forest is effective for predicting sleep quality from health and lifestyle factors.
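The labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact cutoffs, so the 7 to 9 hour range used here is an assumption based on the National Sleep Foundation recommendation for adults, and the names `IDEAL_MIN_HOURS`, `IDEAL_MAX_HOURS`, `label_sleep`, and `label_all` are hypothetical. Under the U-shaped view, both too little and too much sleep fall into the Non-Ideal Sleep class.

```python
from typing import Iterable, List

# Assumed cutoffs (not stated in the abstract): the National Sleep Foundation
# recommends 7 to 9 hours of sleep for adults.
IDEAL_MIN_HOURS = 7.0
IDEAL_MAX_HOURS = 9.0


def label_sleep(duration_hours: float) -> str:
    """Label one sleep duration; values outside the assumed range,
    whether too short or too long, are treated as non-ideal."""
    if IDEAL_MIN_HOURS <= duration_hours <= IDEAL_MAX_HOURS:
        return "Ideal Sleep"
    return "Non-Ideal Sleep"


def label_all(durations: Iterable[float]) -> List[str]:
    """Label a sequence of sleep durations given in hours."""
    return [label_sleep(d) for d in durations]


if __name__ == "__main__":
    sample = [5.9, 7.0, 8.1, 9.5]
    for hours, label in zip(sample, label_all(sample)):
        print(f"{hours:.1f} h -> {label}")
```

The resulting binary label would then serve as the target for the classifier; the actual thresholds used in the study may differ from this sketch.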


Keywords


Sleep Quality; Machine Learning; Random Forest; Prediction


References


M. Maulidah and N. Hidayati, “Prediksi Kesehatan Tidur Dan Gaya Hidup Menggunakan Machine Learning,” CONTEN Comput. Netw., vol. 4, no. 1, pp. 81–86, 2024, [Online]. Available: http://jurnal.bsi.ac.id/index.php/conten/article/view/4918, http://jurnal.bsi.ac.id/index.php/conten/article/download/4918/1759

M. A. Alnawwar, M. I. Alraddadi, R. A. Algethmi, G. A. Salem, M. A. Salem, and A. A. Alharbi, “The Effect of Physical Activity on Sleep Quality and Sleep Disorder: A Systematic Review,” Cureus, vol. 15, no. 8, 2023, doi: 10.7759/cureus.43595.

G. A. M. Ashfania, T. Prahasto, A. Widodo, and T. Warsokusumo, “Penggunaan Algoritma Random Forest untuk Klasifikasi berbasis Kinerja Efisiensi Energi pada Sistem Pembangkit Daya,” Rotasi.

N. Khasanah, D. U. Eka Saputri, F. Aziz, and T. Hidayat, “Studi Perbandingan Algoritma Random Forest dan K-Nearest Neighbors (KNN) dalam Klasifikasi Gangguan Tidur,” Comput. Sci., vol. 5, no. 1, pp. 17–25, 2025, doi: 10.31294/coscience.v5i1.5522.

M. Du, M. Liu, and J. Liu, “U-shaped association between sleep duration and the risk of respiratory diseases mortality: a large prospective cohort study from UK Biobank,” J. Clin. Sleep Med., vol. 19, no. 11, pp. 1923–1932, 2023, doi: 10.5664/jcsm.10732.

M. Hirshkowitz et al., “National sleep foundation’s sleep time duration recommendations: Methodology and results summary,” Sleep Heal., vol. 1, no. 1, pp. 40–43, 2015, doi: 10.1016/j.sleh.2014.12.010.

L. Tharmalingam, “Sleep Health and Lifestyle Dataset.” [Online]. Available: https://www.kaggle.com/datasets/uom190346a/sleep-health-and-lifestyle-dataset

M. P. Pulungan, A. Purnomo, and A. Kurniasih, “Penerapan SMOTE untuk Mengatasi Imbalance Class dalam Klasifikasi Kepribadian MBTI Menggunakan Naive Bayes Classifier,” J. Teknol. Inf. dan Ilmu Komput., vol. 11, no. 5, pp. 1033–1042, 2024, doi: 10.25126/jtiik.2024117989.

A. Algiffary and T. Sutabri, “Analisis Random Forest Menggunakan Principal Component Analysis Pada Data Berdimensi Tinggi,” Indones. J. Comput. Sci., vol. 12, no. 2, pp. 284–301, 2023, [Online]. Available: http://ijcs.stmikindonesia.ac.id/ijcs/index.php/ijcs/article/view/3135

V. S. Prakash, S. N. Bushra, N. Subramanian, D. Indumathy, S. A. L. Mary, and R. Thiagarajan, “Random forest regression with hyper parameter tuning for medical insurance premium prediction,” Int. J. Health Sci. (Qassim)., vol. 6, no. June, pp. 7093–7101, 2022, doi: 10.53730/ijhs.v6ns6.11762.

H. Hairani, A. Anggrawan, and D. Priyanto, “Improvement Performance of the Random Forest Method on Unbalanced Diabetes Data Classification Using Smote-Tomek Link,” Int. J. INFORMATICS Vis., vol. 7, no. March, pp. 258–264, 2023, [Online]. Available: www.joiv.org/index.php/joiv




DOI: https://doi.org/10.52626/joge.v5i1.83



Journal Geuthee of Engineering and Energy is published by Geuthèë Institute.
St. Teknik II, Reumpet, Krueng Barona Jaya sub-district (23370), Aceh Besar District, Aceh Province, Indonesia.
http://geutheeinstitute.com/
ISSN (Online): 2964-2655
The published content of this journal is licensed under a Creative Commons Attribution 4.0 International License.