Recently, with the development of new antibody-drug conjugates, it has become clinically relevant to detect not only the classic HER2 positive breast cancers but also those with low levels of HER2 expression, namely HER2 ultra-low and HER2 low breast cancer. However, reproducibility and concordance amongst pathologists remain challenging in these cases at the low end of the expression spectrum. In fact, the concordance rate amongst pathologists in detecting HER2 ultra-low breast cancers is around 30%.
Therefore, we developed an AI-integrated online training platform to try to enhance concordance amongst pathologists and accuracy in detecting low levels of HER2 expression, comparing findings with AI support versus without AI support.
What was the methodology and what were the findings?
For the methodology, we ran HER2 IHC masterclasses comprising a total of 150 pathologists from ten countries in Asia, Africa and South America. Each masterclass consisted of three exams – exams A and B without AI support and exam C with AI support. Each pathologist evaluated 20 cases of HER2 IHC in each exam, with exam C performed with AI support.
So we compared findings using AI versus without using AI. Our pathologists applied the ASCO/CAP 2023 guidelines, adapted to include HER2 low and HER2 ultra-low breast cancers, using this AI-integrated online platform. The algorithm calculates, for each tumour cell, the exact pattern of membrane staining, the percentage of each pattern of membrane staining, the HER2 IHC score and the HER2 clinical categorisation. We then calculated accuracy against our reference centre score, established by a group of expert pathologists who determined the HER2 IHC categories, which was considered the gold standard. We then compared findings from exams A and B with exam C to see if AI integration improved accuracy and concordance amongst pathologists.
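The scoring rules described above can be sketched in code. This is a minimal illustration only, assuming the familiar ASCO/CAP-style IHC thresholds with the HER2 low / HER2 ultra-low split discussed here; the function names, inputs and exact cut-offs are assumptions for illustration, not the platform's actual algorithm.

```python
# Illustrative sketch of HER2 IHC scoring adapted for HER2 low / HER2 ultra-low.
# Inputs are percentages of tumour cells showing each membrane-staining pattern.
# All names and thresholds here are illustrative assumptions, not the platform's
# real implementation.

def her2_ihc_score(pct_complete_intense: float,
                   pct_complete_weak_moderate: float,
                   pct_faint_incomplete: float) -> str:
    """Map per-cell membrane-staining percentages to an IHC score."""
    if pct_complete_intense > 10:
        return "3+"
    if pct_complete_weak_moderate > 10 or 0 < pct_complete_intense <= 10:
        return "2+"
    if pct_faint_incomplete > 10:
        return "1+"
    return "0"

def her2_category(score: str, pct_any_staining: float) -> str:
    """Translate an IHC score into the clinical categories discussed above."""
    if score == "3+":
        return "HER2 positive"      # a 2+ would reflex to ISH testing in practice
    if score in ("1+", "2+"):
        return "HER2 low"
    if score == "0" and pct_any_staining > 0:
        return "HER2 ultra-low"     # faint staining in <=10% of tumour cells
    return "HER2 null"

# Example: only 5% of cells with faint incomplete staining -> score 0, ultra-low
print(her2_ihc_score(0, 0, 5))      # "0"
print(her2_category("0", 5))        # "HER2 ultra-low"
```

The point the sketch makes is that the ultra-low call hinges entirely on detecting a small fraction of faintly stained cells, which is exactly where per-cell algorithmic counting helps manual reading.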
What are the clinical implications of these findings?
Amongst close to 20,000 readings, we detected an increase in sensitivity for determining the HER2 IHC score and in concordance amongst pathologists when using AI support, compared to the reference centre score, as well as an increase in sensitivity and concordance amongst pathologists for the determination of HER2 clinical categories.
Manual scoring had the lowest sensitivity for HER2 low and HER2 ultra-low breast cancers; with AI support, sensitivity in detecting these clinical categories increased from around 50% to higher than 90%. For HER2 low specifically, sensitivity increased from 78% to higher than 90% with AI support.
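The evaluation behind these figures can be sketched as follows: per-category sensitivity of each reader against the gold-standard reference score, plus simple pairwise percent agreement as a concordance measure. The data values and function names are made up for illustration; the study's actual statistics may differ.

```python
# Minimal sketch of the evaluation described above: per-category sensitivity
# against a gold-standard reference, and pairwise percent agreement among
# readers as a simple concordance measure. Data are illustrative only.

from itertools import combinations

def sensitivity(reads, gold, category):
    """Fraction of gold-standard `category` cases the reader also called `category`."""
    idx = [i for i, g in enumerate(gold) if g == category]
    if not idx:
        return None
    return sum(reads[i] == category for i in idx) / len(idx)

def pairwise_agreement(all_reads):
    """Mean fraction of cases on which each pair of readers agrees."""
    pairs = list(combinations(all_reads, 2))
    return sum(
        sum(a == b for a, b in zip(r1, r2)) / len(r1) for r1, r2 in pairs
    ) / len(pairs)

gold    = ["low", "ultra-low", "null", "low"]
reader1 = ["low", "null",      "null", "low"]   # misses the ultra-low case
reader2 = ["low", "ultra-low", "null", "low"]

print(sensitivity(reader1, gold, "low"))        # 1.0
print(sensitivity(reader1, gold, "ultra-low"))  # 0.0
print(pairwise_agreement([reader1, reader2]))   # 0.75
```

In practice a chance-corrected statistic such as Cohen's or Fleiss' kappa would usually accompany raw percent agreement, but the sketch shows the basic comparison being made.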
Finally, we saw a decrease in HER2 low misinterpretation by 24%, meaning that cases and patients that had not been called HER2 low or HER2 ultra-low were, according to the gold-standard reference score, actually HER2 low or HER2 ultra-low breast cancers. So, with this increase in sensitivity, we could potentially give more patients access to effective HER2-targeted therapies using AI.
Is there anything else you would like to add?
All these AI tools are already commercially available, so they can already be applied in clinical practice. They are not all approved; for instance, here in the US some algorithms have been FDA approved, but not all of them. But as long as pathologists also do the manual scoring and use the AI only as an auxiliary tool, it can already be applied in clinical routine.
So we can reach this higher sensitivity and accuracy for each HER2 test, and more patients can become eligible for effective treatment using AI that is already here. So it's the present, not the future. But access and costs are problems, especially when we are talking about widely available, global use. We have to have appropriate infrastructure and large cloud storage capacity, and each algorithm licence can be costly as well. So it's important for the scientific community to discuss how these highly successful tools can be made increasingly available worldwide.