
Novel approach for precision medicine and drug discovery on gene expression data

17 Nov 2016

Insilico Medicine has announced the publication of its in silico Pathway Activation Network Decomposition Analysis (iPANDA), a novel approach for analysing signalling and metabolic pathway perturbation states using gene expression data, in Nature Communications.

iPANDA is a scalable, robust method for rapid biomarker development using gene expression data.

In the present work, the Insilico Medicine team, together with collaborators from Johns Hopkins University, Albert Einstein College of Medicine, Boston University, Novartis, Nestle and BioTime, demonstrates that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures for breast cancer patients according to their sensitivity to neoadjuvant therapy.

"In this paper we describe one of our most sensitive and biologically-relevant approaches for integrating and analysing gene expression data and present its application for stratifying responders and non-responders to targeted chemotherapy. However, the applications of this method stretch beyond precision medicine and we use it primarily for artificial intelligence and ageing research. My team and many of our collaborators also use it to develop targetable tissue-specific signatures of senescence, cross-species gene expression analysis ", said Ivan Ozerov, PhD, head of senolytics group at Insilico Medicine and the lead author on the paper.

"iPANDA is a universal tool, which allows us to perform in-depth analysis of the effects of external perturbations on the activation of signaling pathways and how it affects the downstream targets. A systematic use of such approach will contribute to obtain a better understanding of how genes involved in various cancers, disease and age-related diseases are dynamically controlled by sets of highly complex networks of signalling pathways. Information gained using in silico approaches are helpful to design more efficient therapies", said Ksenia Lezhnina, head of the "Nutriomi" personalised omics-informed nutrition group at Insilico Medicine and the co-author of the paper.

The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores.
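The combination described above can be illustrated with a toy calculation. The Python sketch below is not the published iPANDA formula; it is a simplified illustration under stated assumptions: each gene contributes its log fold change scaled by an importance weight and a signed topology coefficient, and genes belonging to a precalculated coexpression module are averaged so that correlated genes are not counted repeatedly. The function name, argument names and example values are all hypothetical.

```python
def pathway_activation_score(log_fc, importance, topology, modules):
    """Toy pathway activation score (illustrative only, not the iPANDA formula).

    log_fc     -- dict: gene -> log2 fold change (case vs. reference)
    importance -- dict: gene -> weight in [0, 1], assumed to reflect how
                  strongly the gene is differentially expressed
    topology   -- dict: gene -> signed coefficient (+1 activator, -1 repressor)
    modules    -- list of sets of genes assumed to be coexpressed
    """
    def contribution(gene):
        return importance[gene] * topology[gene] * log_fc[gene]

    genes_in_modules = set().union(*modules)
    score = 0.0

    # Each coexpression module contributes the mean of its members,
    # so a group of correlated genes counts as a single unit.
    for module in modules:
        members = [g for g in module if g in log_fc]
        if members:
            score += sum(contribution(g) for g in members) / len(members)

    # Genes outside any module contribute individually.
    for gene in log_fc:
        if gene not in genes_in_modules:
            score += contribution(gene)

    return score


# Hypothetical three-gene pathway: A and B are coexpressed, C is a repressor.
example_score = pathway_activation_score(
    log_fc={"A": 1.0, "B": 2.0, "C": -1.0},
    importance={"A": 1.0, "B": 0.5, "C": 1.0},
    topology={"A": 1, "B": 1, "C": -1},
    modules=[{"A", "B"}],
)
print(example_score)
```

In this toy setting a positive score suggests pathway activation and a negative score suggests suppression; averaging within modules is one simple way to limit the influence of many correlated genes.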

While modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or disease condition, iPANDA produces highly consistent sets of biologically relevant biomarkers acquired on multiple transcriptomic data sets.

"Deep learning is turning into a lego game, where the various techniques are combined as blocks of sophisticated architectures. However, ensuring biological relevance of the outputs and learned or generated features is difficult. When we started working on this method in 2014, we came up with the criteria for pathway activation scoring quality metrics that included the ability to find highly discriminative pathway markers, biological relevance, cumulative effect, batch effect minimization, consistent performance at the platform- , species-, tissue- and experiment-level, and most importantly, the ability to reduce the dimensionality of gene expression data for the deep neural networks. iPANDA was conceived with all these criteria in mind and we use it primarily for dimensionality reduction in applications utilising deep learning", said Alex Zhavoronkov, PhD, CEO of Insilico Medicine.

Signalling pathway activation scoring using iPANDA will likely help reduce the dimensionality of expression data without losing biological relevance, and the scores may be used as input to rapidly developing deep learning methods, especially for drug discovery applications.

Using iPANDA values as input data seems to be a particularly high-potential approach to obtaining reproducible results when analysing transcriptomic data from multiple sources.

Therefore, while there is no single preferential approach for interpreting gene expression results, the iPANDA method of transcriptomic data analysis on the signalling pathway level may not only be useful for discriminating between various biological or clinical conditions, but may also aid in identifying functional categories or pathways that may be relevant as possible therapeutic targets.

"Gene expression profiles generated on microarray equipment is one of the most abundant biological data types with hundreds of thousands of published experiments. A method which allows scientists to make better use of this data, perform quality control and integrate it with RNASeq and protein expression data is extremely important", said Charles Cantor, PhD, the co-founder and former CSO of Sequenom, former CSO of the Human Genome Project from the DOE and advisor to Insilico Medicine, a co-author of the publication.

"The iPANDA algorithm is a very sensitive pathway perturbation analysis method, which allows for granular comparison of the differentiation states of human cell and tissues using highly sparse gene expression data. It also serves as a biologically-relevant dimensionality reduction and feature selection tool for training the deep neural networks", said Michael West, Ph.D., CEO of BioTime, Inc

Source: Insilico Medicine, Inc.