Machine learning algorithm improves prognosis accuracy for patients with myelodysplastic syndromes

Thanks to all of you for the opportunity to be here with you today. I’m going to discuss an approach we took to build a personalised prediction model that can predict outcomes for patients with a form of blood cancer called myelodysplastic syndrome (MDS). Myelodysplastic syndrome is a blood cancer that typically occurs in older adults and is associated with very low blood counts. There is a tendency to progress to acute myeloid leukaemia. Survival can be measured in years for some of these patients but in months for others.
Prognosis in MDS, and in oncology in general, is one of the most important things that we can do, because after we make the diagnosis the next step in treating the patient is to stage the disease, or identify the risk. That’s extremely important for patients because it helps them understand their disease, set their expectations early and know what to expect throughout their journey. Of course, the first question from our patients is always, ‘How long am I going to live?’ For the treating physician it’s also important because all our consensus guidelines and treatment recommendations are based on risk stratification. We typically stratify our patients into lower risk, meaning a lower risk of progressive disease and of progression to acute myeloid leukaemia, and higher risk, meaning a higher risk of progressive disease, shorter survival and a higher risk of progression to acute myeloid leukaemia. You can see here that the treatment algorithms for those patients are different. For example, we tend to offer high-risk patients transplant up front as a curative option; we don’t offer that to lower-risk patients because the risks from transplant outweigh the benefit. So if we label a patient’s disease as higher risk and it behaves like lower risk, we’re changing the management and may be over-treating that patient. And vice versa: if our model tells us that a patient is lower risk but the disease behaves like higher risk, that also becomes a problem.
Now, what we also found is that what we predict for patients and their actual survival can be completely different. Take, for example, 1,500 patients with MDS and plot their survival. On the x-axis here you see the traditional prognostic model that we use in MDS, the Revised International Prognostic Scoring System (IPSS-R), and the risk categories that come with the model. So we typically assign a patient to, let’s say, lower risk and quote a median overall survival and a risk of AML transformation. On the y-axis you see the survival probability of the patients; the orange triangles are patients who have died and the green circles are patients who are alive.

Now, when we talk about averages, yes, the average survival for patients with low risk is longer than for patients with intermediate risk. But when we look specifically at each individual, you can find some patients with low risk who actually do worse than some patients with high risk. You can find patients with intermediate risk all over the place, some of them behaving like lower risk, some like higher risk. This really prompts the question: can we build a model that provides a personalised prediction that is specific to a given patient? To do that, we took an approach that harnesses the power of artificial intelligence and machine learning to help us build the model.
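
To make the point about averages concrete, here is a minimal sketch in Python. The survival times are made up for illustration and are not data from the study, and the sketch ignores censoring (patients still alive at last follow-up), which a real analysis would have to handle. It shows how group medians can be well separated while individual patients in the low-risk group still fall below the high-risk median.

```python
from statistics import median

# Made-up survival times in months, grouped by assigned risk category.
# These numbers are illustrative only; they are not from the study.
survival_months = {
    "low": [80, 65, 9, 72, 14, 90],
    "intermediate": [40, 8, 75, 30, 12],
    "high": [10, 18, 6, 36, 15],
}

# The summary a category-based model reports: one median per risk group.
medians = {group: median(times) for group, times in survival_months.items()}

# Individual patients labelled low risk who lived no longer than
# the median high-risk patient.
low_below_high_median = [
    t for t in survival_months["low"] if t <= medians["high"]
]

print(medians)
print(low_below_high_median)
```

In this toy data the low-risk median is far above the high-risk median, yet two of the six low-risk patients did no better than a typical high-risk patient.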

So, what we did: with machine learning you can take clinical data, genomic data or any type of data and plug it into an algorithm, and then we want to know what’s going on inside the algorithm. Everybody talks about machine learning as a black box, but there are several techniques we can use to extract features from the model. In other words, we ask the algorithm which variables were important in determining the outcome, then we extract those variables and rebuild the model. This has two implications: number one, at least we know what the algorithm is doing; and number two, we try to learn from the algorithm something new that we’ve never seen before using traditional statistical methods.
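
The talk does not say which feature-extraction technique was used. As an assumption, the sketch below uses permutation importance, one common model-agnostic option: shuffle one variable at a time and measure how much the prediction error grows. The model `toy_model`, the feature names and the data are all hypothetical stand-ins, not the study’s model or variables.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between predictions and targets."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, rng):
    """Increase in error when each feature column is shuffled in turn."""
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for k in range(n_features):
        column = [r[k] for r in rows]
        rng.shuffle(column)  # break the link between feature k and the outcome
        shuffled = [r[:k] + [v] + r[k + 1:] for r, v in zip(rows, column)]
        importances.append(mean_squared_error(model, shuffled, targets) - baseline)
    return importances


def toy_model(row):
    # Hypothetical stand-in for a fitted black-box model:
    # it relies heavily on feature 0, slightly on feature 1, not at all on feature 2.
    return 3.0 * row[0] + 0.5 * row[1]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows = [[rng.random() for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) + rng.gauss(0, 0.1) for r in rows]  # synthetic outcome

feature_names = ["feature_a", "feature_b", "feature_c"]  # hypothetical names
importances = permutation_importance(toy_model, rows, targets, 3, rng)

# Rank the variables, then keep only those that mattered for a rebuilt model.
ranked = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
selected = [name for name, score in ranked if score > 0]
print(ranked)
print(selected)
```

The selected variables would then be used to refit a smaller model, which is the "extract and rebuild" step described above. A variable the model never uses scores exactly zero, because shuffling it leaves every prediction unchanged.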

So we took clinical data, demographic data and genomic data. We had a training cohort from Cleveland Clinic and the Munich Leukaemia Laboratory in Europe, and we validated our results in data from Moffitt Cancer Center. We ran this data through the machine learning algorithm, extracted the important variables, and in the end came up with a final model from the algorithm. A model like this can’t be used in the clinic as it stands, so we built a web application tool. You can see me here in the web application changing the clinical characteristics and the mutations; it calculates, and you can see the patient’s survival curves and survival probability change. Now I’m putting in some poor-risk features for this patient, I’m hitting calculate, and you can see the survival for this patient is different.

When we compare this approach with the current standard-of-care models, the IPSS and IPSS-R, it significantly outperforms them. In the past we have incorporated mutational data into the IPSS and IPSS-R and shown that this can improve the predictive ability of those models. But with this machine learning and artificial intelligence approach we’re able to achieve much higher accuracy in predicting overall survival as well as leukaemia-free survival.
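
The talk does not state which accuracy metric was used for the comparison. One widely used measure for survival models is the concordance index (Harrell’s C), and the sketch below assumes it for illustration. All times, event flags and scores are made-up toy values, not results from the study, and pairs with tied survival times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable patient pairs ranked correctly.

    A pair (i, j) is comparable when patient i had the event (events[i] is 1)
    and did so before patient j's last follow-up time. The pair is concordant
    when the model gave patient i the higher risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: follow-up in months, 1 = died, 0 = alive at last follow-up.
times = [6, 14, 22, 35, 50]
events = [1, 1, 0, 1, 0]

# Hypothetical coarse category score (most patients share one category)
# versus a hypothetical per-patient continuous risk score.
category_score = [2, 2, 2, 2, 1]
personal_score = [0.9, 0.5, 0.7, 0.4, 0.1]

c_categories = concordance_index(times, events, category_score)
c_personal = concordance_index(times, events, personal_score)
print(c_categories, c_personal)
```

A value of 0.5 means the ranking is no better than chance and 1.0 means every comparable pair is ordered correctly. A coarse category score produces many tied pairs, which is one reason a per-patient score can reach a higher concordance.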

Of course, this work would never have been done without a great collaboration with colleagues in the United States and Europe, from the MDS Research Consortium and the Munich Leukaemia Laboratory. And of course the biggest thanks go to our funders and to all the patients who participated in our research. I would like to thank all of you for your attention, and I’m happy to take any questions afterwards.

ASH 2018
