Non-coding mutations driving chronic lymphocytic leukaemia
Dr Xose Puente - University of Oviedo, Oviedo, Spain
Chronic lymphocytic leukaemia is the most frequent leukaemia in industrialised countries. We have been studying this disease for the last six years, more or less, and we have been sequencing the genomes of 500 tumours, as well as the DNA from normal cells from the same patients, in order to characterise the molecular alterations present in the tumour cells. From that study we have been able to identify 60 genes that, when mutated, drive the oncogenic process. At the same time we have been doing whole genome sequencing in 150 tumours. So we have the genes, but 99% of the mutations are in non-coding regions. So our question was: is there any region in the genome that doesn't code for a protein but is still important for tumour development? In our study what we have found is two regions that don't code for a protein and that are important; they are mutated at very high frequency in our set.
One of those regions was at the end of the three prime untranslated region of a gene called Notch1. When you have a gene, you have all the coding regions in different exons and at the end you have a stop, a signal to stop the translation of that information. After that you have the untranslated region. In that untranslated region we identified fourteen different tumours harbouring exactly the same mutation in the same spot, and that mutation wasn't in a very conserved region of the genome, it was in something very unspecific. We wondered what was going on here.
At the same time that we were sequencing the whole genome we were also sequencing RNA from the same tumours. When we looked at the effect of that mutation in the RNA, what we found is that the mutation created a splicing acceptor site. Alongside this new acceptor site there was a cryptic donor site in the coding region, in the last exon, which is where the coding mutations usually are. That donor site is normally not used for anything, but once you create an acceptor site here, what happens in the cells is that you get splicing between the two. The result is that 500 bases of the RNA are removed, including 150 bases coding for protein. So the resulting protein is 50 amino acids shorter. What happens when you remove the last 50 amino acids of Notch1 is that in that particular region you have a signal that tells the cell how much time this protein should stay in the cell. Notch1 is very important for development, and when you have something that is important, you want it to perform its function and then stop existing. You don't want it to stay there telling the cell to proliferate, to differentiate. So when you remove the last 50 amino acids of Notch1 what you get is a very stable protein that is the cause of the initiation of the tumorigenesis process in these cells.
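The arithmetic behind those figures can be written out as a minimal sketch. The base counts are the ones quoted in the answer above; the function name is hypothetical, and treating every three coding bases as one amino acid (ignoring the stop codon and any reading-frame detail) is a simplifying assumption, not the study's own method.

```python
# Illustrative only: the base counts come from the talk, everything else
# (names, the simple 3-bases-per-residue rule) is an assumption.

CODON_LENGTH = 3  # bases of mRNA per amino acid


def summarise_splice_loss(total_bases_removed: int, coding_bases_removed: int) -> dict:
    """Split a spliced-out stretch of mRNA into coding and untranslated parts."""
    if not 0 <= coding_bases_removed <= total_bases_removed:
        raise ValueError("coding bases must be between 0 and the total removed")
    return {
        "coding_bases": coding_bases_removed,
        # Whatever is not coding lies in the untranslated region.
        "utr_bases": total_bases_removed - coding_bases_removed,
        # Approximate number of amino acids missing from the protein.
        "residues_lost": coding_bases_removed // CODON_LENGTH,
    }


# Figures quoted in the talk: 500 bases removed, 150 of them coding.
loss = summarise_splice_loss(total_bases_removed=500, coding_bases_removed=150)
print(loss)
```

With the quoted figures this gives 50 amino acids lost and 350 bases removed from the untranslated region.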
What is very important is that, by knowing the mechanism by which these non-coding mutations are able to produce cancer, you can see that the strategies that are right now in development to target Notch mutations in the coding region can also be applied to the mutations in the non-coding region.
On the other hand, we were also studying another region of the genome with a high number of mutations. About 10% of our tumours harbour mutations in the middle of nowhere, in an intergenic region. When we looked at what was going on there, what we found is that you had a lot of binding sites for different transcription factors. It looked like an enhancer but it hadn't been described. So what we did was to study which genes are in the neighbourhood and whether their expression differs between tumours with mutations in that particular enhancer and tumours without them. Only one gene in the neighbourhood showed statistically significant changes in expression, and that gene was Pax5. The expression of Pax5 was reduced when you have mutations in this enhancer, and Pax5 is a very important transcription factor in B-cell development. In order to confirm that this was really the mechanism by which these mutations in non-coding regions were affecting the expression of Pax5, what we did was to use genome editing to remove that particular piece of DNA, that enhancer, from cells. We saw that the expression of Pax5 was indeed reduced. And if we introduced into cells the same mutations that we found in patients, it was also reduced.
So what does this information mean? The genome is very large and we still don't understand very well how it works. We more or less know how it works for protein coding genes, but for the rest of the genome, which is 98% of it, it's a little difficult to figure out which part of that DNA is important for disease or for cancer and which part is not. By using these strategies we have been able to identify two non-coding regions that are very important for CLL development.
This all sounds very new.
Yes, until now the only known example of mutations in non-coding DNA was in the telomerase gene, in the promoter region of the telomerase gene. When you have mutations there you get a higher expression of telomerase, which is important for tumour cells. Those mutations are very frequent in melanoma as well as in liver cancer. But, for example, in CLL we don't have that kind of mutation. What we have found with the Notch1 mutations is a new way in which you can modify the function of a gene, by creating acceptor sites outside of the coding region.
What are the main aims of the project?
Now in the TCGA project as well as in the ICGC, the International Cancer Genome Consortium, we are generating thousands of genomes. What we are trying to do is to identify more mutations in non-coding regions. We are pretty sure that there are going to be some mutations that are important in non-coding regions, though most of the important mutations are in coding regions. But what our study shows is that, as important as the analysis of big data is, it is also very important to perform functional studies to validate your in silico findings.
How are you using the data?
In the project we are people from very different backgrounds: you have clinicians, you have molecular biologists, bioinformaticians, some statisticians, and all together we are analysing all those data in order to extract as much information as we can to understand how the oncogenic process starts in the cell. Another important thing that I forgot to mention is that Notch1 is the most frequently mutated gene in CLL and it's a bad prognosis gene. Patients with mutations in that gene usually have a poor prognosis and also a higher risk of transforming into malignant lymphoma. What we have found is that one quarter of the patients with mutations in Notch1 have them in non-coding regions, and that's very important because, as you know, right now, and this is going to be adopted worldwide in the next few years, most tumours will be sequenced in order to identify the prognosis or the best treatment for that kind of tumour. If you are missing one quarter of the patients because you are only looking at the coding regions, that's very important.
Do you have a take home message?
Our main conclusion here is that by performing whole genome sequencing and integrative analysis we have been able to identify many mutations. Basically, with the 500 patients that we have sequenced, we have uncovered the most important genes that are mutated in chronic lymphocytic leukaemia, and that establishes the basis for the personalised medicine that is probably going to be the standard of care in the next few years.
What does the future hold?
Right now what we have been doing is basically to clear up the… It's like you are in a forest and you try to clear away all the vegetation in order to see where the trees are. With that information, another stage in the study of cancer genomics now starts, one that is going to be more related to the clinic, in order to see which of the genes that we have uncovered here have clinical significance. Probably in the next two or three years we are going to see a lot of information coming out of those studies, based on the genes that we have uncovered in this study that we are presenting at this conference.