
Analytics in Healthcare – Xerox Challenge

A brief overview of how I approached a real-world healthcare problem via analytics.

Robin Singh

Disclaimer: I have tried to keep technicalities to a minimum in this post, so as to cater to a wider range of readers. However, a little familiarity with machine learning will make the rest of the post more comprehensible and, hopefully, more exciting.

The healthcare industry is not only huge but also has tremendous potential for the use of technology and data science. In this post, I will share one of the numerous instances where unconventional analytics is used to engineer solutions to challenges in healthcare. This seemingly complex idea can be structured and implemented through a combination of machine learning tools and data crunching techniques.

The solution proposed here is designed to work with critical patient data in hospitals and raise an alarm when a patient's state degrades in a way that could eventually lead to a fatal outcome. Now the obvious questions: how will such a system help? Can doctors not monitor patient state physically? Well, a doctor can physically monitor only a small number of patients. What if the number of patients is large? Also, the decision to provide intensive care to a patient after an alarm has been raised has both monetary and human-life impacts. If the alarm is raised in time, intensive care, although expensive, can be provided to the patient.

At the backend, the model can be seen as a typical machine learning classification problem. Classification, as the name suggests, is a method to categorize data points into predetermined target groups. Numerous algorithms can do classification, such as Bayesian models, decision trees, random forests and regression-based models (for example, logistic regression). We used a Random Forest model for the current data set because of its simplicity and ease of implementation.
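To make the idea of a forest-style classifier concrete, here is a minimal sketch in plain Python. It is not the model used in the challenge: it is a simplified stand-in that bags one-split decision "stumps" over bootstrap samples and random feature subsets, then takes a majority vote, which is the core mechanism a random forest relies on. The vital-sign columns, the toy values and all function names are assumptions made up for illustration.

```python
import random

# Hypothetical toy data: [heart_rate, resp_rate, spo2, systolic_bp].
# Label 1 = patient deteriorated, 0 = patient stayed stable.
ROWS = [
    [72, 14, 98, 120], [80, 16, 97, 115], [76, 13, 96, 118], [84, 17, 99, 122],
    [112, 26, 88, 90], [118, 28, 86, 85], [121, 24, 90, 92], [115, 30, 87, 88],
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold, label-above-threshold) with fewest errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            for high_label in (0, 1):
                errors = sum(
                    (high_label if r[f] >= t else 1 - high_label) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, high_label)
    return best[1], best[2], best[3]


def predict_stump(stump, row):
    f, t, high_label = stump
    return high_label if row[f] >= t else 1 - high_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a random subset of features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    k = max(1, int(n_features ** 0.5))  # features considered per stump
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict_proba(forest, row):
    """Fraction of stumps voting for class 1 (deterioration)."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)


def predict(forest, row):
    return 1 if predict_proba(forest, row) >= 0.5 else 0


forest = fit_forest(ROWS, LABELS)
stable_patient = [74, 15, 99, 124]
deteriorating_patient = [125, 32, 84, 82]
print(predict(forest, stable_patient), predict(forest, deteriorating_patient))
```

In practice one would more likely reach for a library implementation with full decision trees rather than stumps; the sketch only shows why averaging many weak, differently-trained classifiers gives a usable risk score.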

However, the classification method happened to be only the tip of the iceberg. There were many unforeseen challenges, primarily because the data came from the healthcare domain and because of computational resource constraints.

First, healthcare data is highly erratic, and the severity of a measurement varies from person to person. For example, a certain value of a respiratory measurement can be dangerous and life-threatening for a typical person but normal for a smoker. This poses a fundamental challenge to the accuracy of models built on healthcare data.

Second, the state to be predicted is different from the state for which training data is available. This is slightly difficult to grasp, but let's try. We want to raise an alarm when the patient's condition is worsening from normal and approaching mortality, while there is still time to act. However, the training dataset only records the actual outcome: mortality or no-mortality. Learning from this data therefore means making an approximation.

The third challenge comes from implementation aspects. The prediction of no-mortality needs to be far more reliable than the prediction of mortality: the system should predict no-mortality situations with an accuracy of 99% or above. Accuracy on no-mortality trades off against accuracy on mortality, so if we tune the model for high accuracy on no-mortality, the accuracy on mortality is low.
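The trade-off in the third challenge can be sketched as a choice of alarm threshold on the model's risk score. The example below is only an illustration under assumptions: it reads "accuracy on no-mortality" as specificity (the share of no-mortality cases that raise no alarm), the scores are synthetic, and `pick_alarm_threshold` is a hypothetical helper, not part of the original solution.

```python
def pick_alarm_threshold(scores, labels, min_specificity=0.99):
    """Return (threshold, specificity, sensitivity) for the lowest threshold
    that meets the specificity target.

    scores: predicted risk in [0, 1]; labels: 1 = mortality, 0 = no-mortality.
    An alarm is raised when score >= threshold. Specificity can only grow as
    the threshold rises, so the lowest qualifying threshold keeps as much
    sensitivity (accuracy on the mortality class) as the target allows.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    # float("inf") guarantees a result: no alarms at all means 100% specificity.
    for t in sorted(set(scores)) + [float("inf")]:
        specificity = sum(s < t for s in negatives) / len(negatives)
        if specificity >= min_specificity:
            sensitivity = sum(s >= t for s in positives) / len(positives)
            return t, specificity, sensitivity


# Synthetic risk scores: 100 no-mortality cases and 10 mortality cases.
neg_scores = [i / 200 for i in range(100)]
pos_scores = [0.30, 0.40, 0.45, 0.48, 0.55, 0.60, 0.70, 0.80, 0.90, 0.95]
scores = neg_scores + pos_scores
labels = [0] * len(neg_scores) + [1] * len(pos_scores)

strict = pick_alarm_threshold(scores, labels, min_specificity=0.99)
loose = pick_alarm_threshold(scores, labels, min_specificity=0.90)
print("99% target:", strict)
print("90% target:", loose)
```

On this synthetic data, demanding 99% on the no-mortality side pushes the threshold up and fewer of the mortality cases trigger an alarm than with the looser 90% target, which is the trade-off described above.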

Let’s take a moment to think about the methodology again. What can we observe? The predictive model seems to replicate logic similar to that of a real doctor. In fact, the very idea of machine learning is to train machines to apply logic the way human beings do. For example, the model uses past data to learn and make decisions in future cases, weighs the trade-offs that arise in decision making, and uses the concept of information value to reach a decision.

Discussed above is one example of the use of analytics and artificial intelligence in healthcare. There are many unexplored applications in the domain, huge scope for improvement in the existing models, and an unquantifiable amount of data to process. In the coming years, devices based on such models will be a reality, and the industry will require many more analysts to meet the demand.