A research team led by Professor Park Sang-Hyun of the Department of Robotics and Mechatronics Engineering (also in charge of the Artificial Intelligence major) at DGIST (President Kuk Yang) has developed a weakly supervised deep learning model that can accurately show the presence and location of cancer in pathological images using only data labeled with whether cancer is present.
Existing deep learning models required a dataset in which the location of the cancer had been accurately drawn in order to identify the cancer site.
The deep learning model developed in this study improves efficiency and is expected to make a significant contribution to the relevant research field.
Generally, solving the zoning problem, which indicates where the cancer is located, requires accurately marking the cancer site, and this takes a long time and therefore increases cost.
To solve this problem, weakly supervised learning models that zone cancer sites using only coarse labels, such as whether cancer is present in the image, are under active study.
However, performance deteriorates significantly when existing weakly supervised learning models are applied to huge pathological image datasets, in which a single image can be as large as a few gigabytes.
To solve this problem, researchers have attempted to improve performance by dividing the pathological image into patches, but the divided patches lose their location information and the correlation between the split pieces, which limits how much of the available information can be used.
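As an illustration of the patch division described above, the following is a minimal sketch, not the research team's code. It assumes non-overlapping square patches and a small NumPy array standing in for a slide; the function name `split_into_patches` is hypothetical.

```python
import numpy as np

def split_into_patches(image, patch_size):
    """Tile an H x W x C image into non-overlapping square patches.

    Returns an array of shape (rows, cols, patch_size, patch_size, C).
    Edge pixels that do not fill a whole patch are dropped.
    """
    h, w, c = image.shape
    rows, cols = h // patch_size, w // patch_size
    cropped = image[:rows * patch_size, :cols * patch_size]
    tiles = cropped.reshape(rows, patch_size, cols, patch_size, c)
    # Move the two grid axes to the front: (rows, cols, p, p, C).
    return tiles.transpose(0, 2, 1, 3, 4)

# Toy "slide": 6 x 8 pixels with 3 channels.
image = np.arange(6 * 8 * 3).reshape(6, 8, 3)
patches = split_into_patches(image, 2)
print(patches.shape)  # (3, 4, 2, 2, 3)

# Flattening to a plain list of patches discards the (row, col) grid index.
flat = patches.reshape(-1, 2, 2, 3)
```

Once the patches are flattened and shuffled for training, as in `flat`, nothing in the data records where each patch came from, which is the loss of location information the article refers to.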
In response, Professor Park Sang-Hyun's research team devised a technique that segments the cancer site using only training data labeled with the presence of cancer at the slide level.
The team developed a pathological image compression technology that first trains the network, through unsupervised contrastive learning, to effectively extract significant features from the patches. The trained network is then used to detect the main features of each patch while keeping its location information, which reduces the size of the image while maintaining the correlation between the patches.
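The idea of compressing a slide while keeping each patch in its place can be sketched as follows. This is only an illustration under stated assumptions: `compress_slide` and `mean_colour_encoder` are hypothetical names, and the encoder here is a trivial stand-in (per-channel mean) for a network trained with contrastive learning, which is not reproduced.

```python
import numpy as np

def compress_slide(image, patch_size, encoder):
    """Replace every patch with its feature vector, keeping the grid layout.

    image:   H x W x C array
    encoder: function mapping one patch to a 1-D feature vector
    Returns an array of shape (rows, cols, feature_dim).
    """
    h, w, _ = image.shape
    rows, cols = h // patch_size, w // patch_size
    grid = None
    for r in range(rows):
        for c in range(cols):
            patch = image[r * patch_size:(r + 1) * patch_size,
                          c * patch_size:(c + 1) * patch_size]
            feat = np.asarray(encoder(patch), dtype=float)
            if grid is None:
                grid = np.zeros((rows, cols, feat.shape[0]))
            # The feature is stored at the patch's own grid position.
            grid[r, c] = feat
    return grid

def mean_colour_encoder(patch):
    # Hypothetical stand-in for a contrastively trained feature extractor.
    return patch.reshape(-1, patch.shape[-1]).mean(axis=0)

# Toy slide: 4 x 4 pixels, with the bottom-left 2 x 2 patch set to 1.
slide = np.zeros((4, 4, 3))
slide[2:4, 0:2] = 1.0
compressed = compress_slide(slide, 2, mean_colour_encoder)
print(compressed.shape)  # (2, 2, 3)
```

The 4 x 4 toy slide shrinks to a 2 x 2 grid, and the distinctive patch is still found at grid position (1, 0), so neighbouring features in the compressed image correspond to neighbouring patches in the original.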
The team then developed a model that uses a class activation map to find the regions that are highly likely to contain cancer in the compressed pathology images, and a pixel correlation module (PCM) to zone all such regions across the entire pathology image.
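For readers unfamiliar with class activation maps, here is a minimal sketch of the standard formulation, in which a classifier applies global average pooling followed by a linear layer and the map is the per-location weighted sum of feature channels. The feature values and weights below are made up for illustration, the function names are hypothetical, and the PCM refinement step is not shown.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Compute a class activation map for one class.

    features:      (H, W, K) feature map from the last convolutional layer
    class_weights: (K,) linear-layer weights for the class of interest
    Returns an (H, W) map: the weighted sum of channels at each location.
    """
    return features @ class_weights

def normalise(cam):
    """Shift and scale a map into the range [0, 1]."""
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy 3 x 3 feature map with 2 channels; only the centre is active.
features = np.zeros((3, 3, 2))
features[1, 1] = [2.0, 1.0]
weights = np.array([1.0, -0.5])  # illustrative weights, not learned

cam = class_activation_map(features, weights)
mask = normalise(cam) > 0.5  # candidate region for the class
print(mask.astype(int))
```

Averaging the map over all locations gives the same value as the classifier's score for that class (ignoring any bias term), which is why high values in the map point to the locations that drove a slide-level prediction.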
In the cancer zoning problem, the newly developed deep learning model achieved a Dice similarity coefficient (DSC) score of up to 81 - 84 using only training data with slide-level cancer labels.
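The DSC measures the overlap between a predicted region A and the reference region B as 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical). The sketch below computes it for two small binary masks; the masks are invented for illustration, and reading the reported scores as this value multiplied by 100 is an assumption, since the article does not state the scale.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        # Convention chosen for this sketch: two empty masks agree fully.
        return 1.0
    overlap = np.logical_and(pred, target).sum()
    return float(2.0 * overlap / total)

pred = [[1, 1, 0],
        [0, 1, 0]]    # 3 predicted pixels
target = [[1, 1, 0],
          [0, 0, 1]]  # 3 reference pixels, 2 of them shared with pred

score = dice_coefficient(pred, target)
print(score)  # 2*2 / (3+3), about 0.667
```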
This significantly exceeded the performance of previously proposed patch-level methods and other weakly supervised learning techniques (DSC score: 20 - 70).
“The model developed through this study has greatly improved the performance of weakly supervised learning of pathological images, and it is expected to contribute to improving the efficiency of various studies requiring pathological image analysis,” said Professor Park Sang-Hyun, who added, “If we can improve the related technology further in the future, it will be possible to use it universally for various medical image zoning issues.”
Meanwhile, the results of this study were recognised for their excellence and were published in the journal Medical Image Analysis (MedIA).