Determining the prognosis of patients with myelodysplastic syndromes using machine learning

Published: 2 Dec 2018
Dr Aziz Nazha - Cleveland Clinic, Cleveland, USA

Dr Aziz Nazha speaks to ecancer at ASH 2018 about the use of machine learning in determining the prognosis of patients with myelodysplastic syndromes. 

He explains that using a machine learning algorithm improves prognostic accuracy for these patients.

Dr Nazha concludes that while there have been advancements in personalised medicine and treatments, personalised diagnosis is also very important.


ecancer's filming has been kindly supported by Amgen through the ecancer Global Foundation. ecancer is editorially independent and there is no influence over content.

The outcome for patients with myelodysplastic syndrome is very heterogeneous, which means that it’s really hard to find two patients that have the same outcome. The current models that we use in clinical practice use clinical variables to predict outcomes for patients, but we have shown that there is typically a significant difference between the survival that the model predicts for a specific patient and the actual survival of that patient. So the question we asked was: can we build a model that can risk stratify patients and provide personalised predictions that are specific for a given patient, and provide that information to them as well as to the treating physician?

What was the model?

What we did was take clinical and mutational data from a cohort of patients treated at the Cleveland Clinic and from the Munich Leukaemia Laboratory in Germany; this is about 1,417 patients. We used a machine learning algorithm called Random Survival Forest, so we entered all of this data into the algorithm. Everybody talks about machine learning as a black box, so after that we tried to extract features from the algorithm; in other words, we tried to ask the algorithm which variables were important in impacting the outcome. This is very important for two reasons: number one, we can dissect those variables and be sure that the algorithm is picking the right clinical variables; also, we can learn from the algorithm if there are any new variables that we didn’t discover using traditional statistical methods.

After we trained our model, in order for us to use it in the clinic we developed a web application where the user, in this case the physician, can input the clinical data and the mutational data, and the output is a survival curve that is specific for the patient as well as survival probabilities at different time points. We then took this model and validated it in a completely independent cohort of patients treated at Moffitt Cancer Center. So now we’re refining the website that will be available for physicians, where they can enter this information, and hopefully it can aid them in their decisions.
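
To make the idea of "a survival curve and survival probabilities at different time points" concrete, here is a minimal Python sketch. It is not the Random Survival Forest model described in the interview; it swaps in the much simpler Kaplan-Meier estimator, which produces one step-shaped survival curve for a whole group of patients rather than one per patient. The follow-up times and outcomes are invented toy numbers, and the function names `kaplan_meier` and `survival_at` are hypothetical helpers written for this illustration.

```python
from bisect import bisect_right


def kaplan_meier(durations, events):
    """Kaplan-Meier survival curve as a list of (time, probability) steps.

    durations: follow-up time for each patient (e.g. months).
    events:    1 if the patient died at that time, 0 if censored
               (still alive at last follow-up).
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)   # patients still being followed
    survival = 1.0        # running survival probability
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # The curve only drops at times where a death is observed.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Read the estimated survival probability at time t off the curve."""
    times = [time for time, _ in curve]
    idx = bisect_right(times, t)
    return 1.0 if idx == 0 else curve[idx - 1][1]


# Invented toy cohort: follow-up in months, 1 = died, 0 = censored.
months = [6, 12, 12, 18, 24, 30]
died = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(months, died)
for point in (12, 24):
    print(f"Estimated survival at {point} months: {survival_at(curve, point):.2f}")
```

A Random Survival Forest goes further than this sketch: it uses each patient's clinical and mutational variables to produce a curve for that individual, which is what allows the web application to report patient-specific probabilities.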

What is the message for doctors watching this?

The concept of this project was: can we develop personalised predictions? We always talk about personalised medicine or giving personalised treatment to our patients, but we really don’t spend much time talking about personalised diagnosis or personalised prediction. So identifying risk that is specific to each patient is very important for our patients, because they ask us for that, and also very important for us as treating physicians, because we don’t want to over-treat or under-treat patients by getting their risk wrong.