Comparison of Machine Learning Methods for Predicting Employee Absences
Abstract
Employee absences cannot be avoided, but excessive and uncontrolled absences affect not only companies and employees but also the economy, government, and society. Although actual losses are hard to compute, absenteeism has been estimated to cost billions in direct and indirect costs. Addressing employee absences is difficult because the underlying causes are complex. Compounding this, companies lack tools to analyze and predict the future risk of employee absences and rely instead on retrospective data that may not be relevant to the current situation. In this study, we show how machine learning methods can be used to predict employee absence risk. Results show that Neural Networks give the best accuracy (77%), ahead of Random Forest (72%) and Support Vector Machines (62%). The effects of training data size and of varied feature sets on the models’ performance were also tested. Finally, a method for ranking the sensitivity of a Neural Network to each feature is presented.
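To make the idea of ranking a Neural Network's sensitivity to each feature concrete, the following minimal sketch may help. It is not the method of this study, which the abstract does not specify. It assumes a generic finite-difference approach: each input feature is nudged slightly, and the average absolute change in the network's output is taken as that feature's sensitivity. The network weights, the three unnamed features, and the helper names `predict` and `rank_sensitivity` are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, W1, b1, w2, b2):
    """Forward pass of a one-hidden-layer network; returns a score in (0, 1)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)


def rank_sensitivity(samples, model, eps=1e-3):
    """Rank features by mean absolute central-difference slope of the output.

    Returns (order, scores), where order lists feature indices from most
    to least sensitive and scores[j] is the sensitivity of feature j.
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        total = 0.0
        for x in samples:
            up = list(x)
            down = list(x)
            up[j] += eps
            down[j] -= eps
            total += abs(model(up) - model(down)) / (2 * eps)
        scores.append(total / len(samples))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical, hand-picked weights (assumption): feature 0 is weighted
# heavily, feature 2 lightly, and feature 1 is ignored by the network.
W1 = [[2.0, 0.0, 0.5], [-1.5, 0.0, 0.2]]
b1 = [0.0, 0.1]
w2 = [1.0, -1.0]
b2 = 0.0

# Synthetic inputs (assumption): 200 samples drawn uniformly from [-1, 1].
rng = random.Random(0)
samples = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]

order, scores = rank_sensitivity(
    samples, lambda x: predict(x, W1, b1, w2, b2)
)
print("features ranked by sensitivity:", order)
```

With a trained model, `predict` would be replaced by the model's own prediction function and `samples` by held-out employee records. Averaging over many records matters because the sensitivity of a nonlinear network varies from one input to another.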