Problem
Current technological trends in business increasingly involve the use of Machine Learning, which allows complex correlations to be analysed in less time and with greater accuracy. In the medical field in particular, this enables developments that improve not only profit but also people's health. This practical relevance, together with my interest in using Machine Learning to gain insight into future outcomes, led me to choose the project in Predictive Process Modelling.
Implementation
To predict the remaining time of processes in hospitals, machine learning and deep learning based approaches are implemented in Python with Scikit-Learn and PyTorch. The evaluated models include Support Vector Machines, K-Nearest Neighbors, XGBoost, Feedforward Neural Networks (FNNs) and Long Short-Term Memory (LSTM) networks. Each model is tested with different preprocessing techniques. The analyses of both approaches are performed on the “sepsis_cases_1” dataset. To investigate the effect of different design choices for the machine learning based approach, a search space is defined. Ray Tune is used to optimize the design choices in this search space by minimizing the resulting RMSE score. For the analysis of the deep learning based approach, the hyperparameters of the FNNs and LSTMs are optimized manually.
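To make the prediction target concrete, the following minimal sketch derives a remaining-time label for every event in an event log. It assumes that the remaining time of an event is the time between that event and the last event of its case. The toy log, its column layout and the function name `remaining_times` are hypothetical and are not taken from the project code or from the “sepsis_cases_1” dataset.

```python
from datetime import datetime

# Hypothetical toy event log: (case_id, activity, timestamp).
# The values are illustrative only and do not come from the dataset.
events = [
    ("case_1", "Registration", "2014-01-01 08:00:00"),
    ("case_1", "Triage", "2014-01-01 08:30:00"),
    ("case_1", "Release", "2014-01-01 12:00:00"),
    ("case_2", "Registration", "2014-01-02 09:00:00"),
    ("case_2", "Release", "2014-01-02 10:00:00"),
]


def remaining_times(events):
    """Return (case_id, activity, remaining_seconds) for every event.

    Assumption: remaining time = timestamp of the last event in the
    same case minus the timestamp of the current event.
    """
    parsed = [
        (case, activity, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"))
        for case, activity, ts in events
    ]

    # Find the latest timestamp of each case.
    case_end = {}
    for case, _, ts in parsed:
        case_end[case] = max(case_end.get(case, ts), ts)

    # The label of each event is the distance to its case end in seconds.
    return [
        (case, activity, (case_end[case] - ts).total_seconds())
        for case, activity, ts in parsed
    ]


targets = remaining_times(events)
for case, activity, seconds in targets:
    print(case, activity, seconds)
```

Labels of this kind can then serve as the regression target for any of the evaluated models. How the project itself encodes the events before training is not shown here.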
Results
Comparing the machine learning based approaches with the deep learning based approaches shows that the machine learning based approaches achieved better results. While the machine learning models reached an RMSE of 0.02 and an MAE of 762.94 on the test data, the neural networks were not able to learn to predict remaining times and performed considerably worse.
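For reference, the two reported metrics can be computed as in the following minimal sketch. It uses the standard definitions of RMSE and MAE. The example values are made up for illustration and are unrelated to the results above.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    absolute_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute_errors) / len(absolute_errors)


# Made-up remaining times (true vs. predicted), for illustration only.
y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 18.0, 33.0]

print("RMSE:", rmse(y_true, y_pred))
print("MAE:", mae(y_true, y_pred))
```

When both metrics are computed on the same values, the RMSE is never smaller than the MAE. The reported RMSE of 0.02 and MAE of 762.94 were therefore presumably measured on different scales, for example a normalized target for the RMSE. This is an assumption, as the summary does not state it.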
Code
https://github.com/eric-official/predictive-process-monitoring