High-throughput DNA sequencing technologies are leading to a revolution in how clinicians diagnose and treat cancer. The molecular profiles of individual tumours are beginning to be used in the design of chemotherapeutic programs optimised for the treatment of individual patients.
The real revolution, however, is coming with the emerging capability to inexpensively and accurately sequence the entire genome of cancers, allowing for the identification of specific mutations responsible for the disease in individual patients.
There is only one downside. Those sequencing technologies provide massive amounts of data that are not easily processed and translated by scientists. That's why Georgia Tech has created a new data analysis algorithm that quickly transforms complex RNA sequence data into usable content for biologists and clinicians.
The RNA-Seq analysis pipeline (R-SAP) was developed by School of Biology Professor John McDonald and Bioinformatics Ph.D. candidate Vinay Mittal. Details of the pipeline are published in the journal Nucleic Acids Research.
"A major bottleneck in the realisation of the dream of personalised medicine is no longer technological. It's computational," said McDonald, director of Georgia Tech's newly created Integrated Cancer Research Center. "R-SAP follows a hierarchical decision-making procedure to accurately characterise various classes of gene transcripts in cancer samples."
The human genome gives rise to at least 23,000 RNAs that encode the sequences of proteins. Millions of other RNAs help regulate the production of proteins.
R-SAP is able to quickly determine every gene's level of RNA expression and provide information about splice variants, biomarkers and chimeric RNAs. Biologists and clinicians will be able to more readily use this data to compare the RNA profiles or "transcriptomes" of normal cells with those of individual cancers and thereby be in a better position to develop optimised personal therapies.
Personalised approaches to cancer medicine are already in widespread use for a few "cancer biomarkers", including variants of the BRCA1 gene that can be used to identify women with a high risk of developing breast and ovarian cancer.
"Our goal was to design a pipeline that is easily installable with parallel processing capabilities," said Mittal. "R-SAP can make 100 million reads in just 90 minutes. Running the program simultaneously on multiple CPUs can further decrease that time."
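To put the quoted figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not part of R-SAP: the function names are hypothetical, and the multi-CPU estimate assumes perfectly linear scaling, which the article does not claim and which real software rarely achieves.

```python
# Back-of-the-envelope arithmetic for the quoted figure:
# 100 million reads handled in 90 minutes on a single run.
READS = 100_000_000
MINUTES = 90


def reads_per_second(reads: int, minutes: float) -> float:
    """Average throughput implied by a read count and a wall-clock time."""
    return reads / (minutes * 60)


def ideal_runtime_minutes(minutes: float, cpus: int) -> float:
    """Best-case runtime on several CPUs.

    ASSUMPTION: perfectly linear speed-up. Actual gains are usually
    smaller because of I/O and coordination overhead.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return minutes / cpus


if __name__ == "__main__":
    print(f"{reads_per_second(READS, MINUTES):,.0f} reads per second")
    for n in (1, 2, 4, 8):
        print(f"{n} CPU(s): at best {ideal_runtime_minutes(MINUTES, n):.1f} minutes")
```

Under these assumptions the quoted rate works out to roughly 18,500 reads per second, and four CPUs would bring the run down to about 22.5 minutes at best.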
R-SAP is open source software, freely accessible at the McDonald Lab website.
"This is another example of Georgia Tech's ability to merge computer technology with science to create an essential feature of next-generation bioinformatics tools," said McDonald. "We hope that R-SAP will be a useful and user-friendly instrument for scientists and clinicians in the field of cancer biology."
Source: Georgia Institute of Technology