Special Issue

Personal Health Information Recommender: implementing a tool for the empowerment of cancer patients

Galatia Iatraki1, Haridimos Kondylakis1, Lefteris Koumakis1, Maria Chatzimina1, Eleni Kazantzaki1, Kostas Marias1,2 and Manolis Tsiknakis1,2

1Computational BioMedicine Laboratory, FORTH-ICS, Heraklion GR70013, Greece

2Department of Informatics Engineering, Technological Educational Institute of Crete, Heraklion GR71004, Greece

Correspondence to: Haridimos Kondylakis. Email:


Nowadays, patients have a wealth of information available on the Internet. Despite the potential benefits of Internet health information seeking, several concerns have been raised about the quality of information and about the patient’s capability to evaluate medical information and to relate it to their own disease and treatment. As such, novel tools are required to effectively guide patients and provide high-quality medical information in an intelligent and personalised manner. With this aim, this paper presents the Personal Health Information Recommender (PHIR), a system to empower patients by enabling them to search in a high-quality document repository selected by experts, avoiding the information overload of the Internet. In addition, the information provided to the patients is personalised, based on individual preferences, medical conditions and other profiling information. Despite the generality of our approach, we apply the PHIR to a personal health record system constructed for cancer patients and we report on the design, the implementation and a preliminary validation of the platform. To the best of our knowledge, our platform is the only one combining natural language processing, ontologies and personal information to offer a unique user experience.

Keywords: medical information, recommendations, information retrieval, natural language processing

Copyright: © the authors; licensee ecancermedicalscience. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Published: 11/07/2018; Received: 16/12/2017


The explosion of the Internet and the rapid increase of available information have greatly affected the way information is accessed and distributed. More and more patients are using the Internet to look for health and medical information. As early as 2002, 80% of all adults online in the United States were estimated to have looked for health information, whereas 23% of Europeans were using the Internet to be informed about their health needs [1]. Patients, by being able to search for health information, are feeling more empowered, active and more inclined towards being involved in decision making regarding their own health [2].

However, despite the tangible benefits of this vast amount of information online, there are certain challenges as well. Besides the large volume, the quality of information is very diverse, ranging from journal articles, social media opinions, forums and so on. As such, a patient should possess the necessary skills to identify and accurately judge the relevance of information relating to his or her own case [1]. Furthermore, online information can lead to patients being misinformed, becoming distressed, and increasing the tendency towards self-diagnosis or self-treatment [3].

With this in mind, this paper focuses on current research activities related to the implementation of the Personal Health Information Recommender (PHIR). The PHIR [4] is targeted at improving the opportunities that patients have to inform themselves on the Internet about their disease and possible treatments, and providing them with personalised information and recommendations [5]. Its goal is threefold:

(a) To deliver relevant information to patients, based on their current profile as represented in their personal health record (PHR) data,

(b) To ensure the quality of the presented information by giving medical experts the chance to control the information that is given, and

(c) To facilitate an easy uptake of the new system, by minimising the necessary manual effort.

With these aims, our contributions are the following:

(a) A set of two individual apps (the PHIR search engine and the Semantic Annotator app) being able to be directly integrated into a PHR system.

(b) The Semantic Annotator app can be used by clinical experts to register validated, high-quality web documents and multimedia material to the platform.

(c) As soon as the relevant information is registered to the platform, the system indexes and semantically uplifts the available information.

(d) The PHIR search engine is then available to patients enabling them to search for useful health-related information. The PHIR search engine advances searching by providing an intelligent recommendation service, which exploits the individual patient profile and patient preferences to provide personalised information.

To the best of our knowledge, PHIR is the only medical information search engine and recommendation system that exploits the individual profile of the patient to provide personalised empowerment.

The whole platform has been developed in the context of the iManageCancer EU ( research project for empowering cancer patients [6, 7]. The project aims to build an advanced, standards-based and scalable semantic integration environment [8], enabling seamless, secure and consistent bi-directional linking of clinical research and clinical care systems, which amongst others, will empower patients to extract the relevant data out of the overwhelmingly large amounts of heterogeneous data and treatment information. Currently, PHIR is piloted with cancer patients in Germany and Italy providing high-quality documents selected by healthcare providers.

The rest of this paper is structured as follows. The section ‘Related work’ presents related work. The section ‘Architecture’ demonstrates the PHIR architecture, describing in detail the available components. Then, the section ‘Evaluation design’ demonstrates the design of our evaluation and a preliminary validation of the platform. Finally, the section ‘Conclusion and discussion’ concludes the paper and discusses future directions.

Related work

There are already several approaches aimed at providing patients with search engines with high-quality medical information such as WebMD (, MayoClinic Patient Care (, Medicine Plus ( and so on. However, these engines provide a rather limited set of information and they are not dynamically adapted according to the patient’s preferences or medical history, which is true in our case.

One of the most well-known search engines is HONSearch [9], which aims to improve the quality of information aimed at both patients and medical professionals by facilitating quick access to the most relevant and up-to-date medical discoveries. Each indexed web document has to fulfil specific criteria in order to be included in the search engine and it uses a multi-stakeholder approach to include the relevant web documents. Although the approach seems to be promising, there have been some concerns relating to the quality [10] on the indexed websites. In a similar vein, but targeting medical experts, AskHermes [11] is an online question answering system aimed at answering specialised clinical questions.

Alternative approaches already exploit the profiles stored in personal health systems [12, 13] to automatically present useful information to patients. For example, adverse drug effects [14] are an alerting system to inform patients about potential risks and the adverse effects of the medications they receive. STEPPS [15] is another approach in the same arena, trying to personalise the automatic retrieval of health information using profiling information. The information is retrieved from an electronic patient record. However, the lack of semantics leads to poor results and the lack of a search engine to allow patient interaction limits the patient’s options.

Another work that focuses on combining social data with a search engine in order to provide valuable information about side-effects, medications and to track the geographic location of influenza epidemics is [16]. Although this approach identifies that added information can really improve the quality of the recommendations offered to the end users, the results are not personalised according to the individual profile.

Furthermore, Stratigi et al [17] exploit semantic information available through the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10) taxonomy to provide recommendations to a small group of patients. However, they do not offer a search engine where users can look for their interested topic. In addition, in [18], semantic information is being exploited for enhancing the results of a recommendation service that facilitates the retrieval of results from electronic health records. Their results show that the quality of the results is highly increased when query recommendation using this information is used. Our approach indeed moves in this direction, combining semantics with recommendations but this time offering high-quality web documents to patients.

This approach builds upon the Interactive Empowerment Services developed in the p-Medicine project [19, 20], where a preliminary conceptualisation of the current approach was implemented. To the best of our knowledge, PHIR is the only system exploiting patients’ profiles to provide both automatic and nonautomatic high-quality information to patients employing semantics, reasoning and also exploiting user preferences.


The architecture of the system is shown in Figure 1 and consists of three layers: the database layer which includes three different databases, the service layer which includes two services and the front-end layer which includes two individual apps (one for the physician and one for the patient). The PHIR has been integrated into iPHR (the PHR system developed within the iManageCancer project) as a set of individual applications. We analyse in detail each one of the aforementioned modules.

Figure 1. PHIR architecture: the architecture schema of the PHIR. The different modules of the system are described in this schema.

PHIR for physicians

The first step in providing useful information to patients is the identification of useful, reliable, high-quality online health information and its appropriate and efficient use. To cope with the unprecedented volume of healthcare information available on the Internet, PHIR uses domain experts (doctors and health care providers) in order to identify and register appropriate web documents. Currently, there are indexed 855 distinct URLs selected by our domain experts for the cancer domain, also available through the web page of the engine.

The Annotator

Using the Annotator app, an expert is able to register external links to documents and videos that contain useful information to be further elaborated. Those documents are high-quality web resources (web pages, pdfs, docs, online videos, etc.) selected carefully by the aforementioned domain experts, targeting patient’s needs. The interface of the app is shown in Figure 2.

In order to register an entry, domain experts just have to fill in a form using the Annotator tool including the URL of interest, a title that describes it, the type (text or video), the language of the content and tags (not obligatory). The tags and the textual content of each URL are parsed using natural language processing, as a pre-processing step, in order to identify the part of speech per word and lemmas. Lemmas of nouns are then matched against concepts from the biomedical ontology SNOMED CT using the Apache Solr server (, performing similarity matching. The identified terms of the SNOMED CT which have a high similarity to the available document/tags are automatically stored along with the specific web resource (document).

Besides the individual web document indexed, a web crawler systematically browses the available URLs in this web document in order to retrieve and index the contents of the entire website. An example is shown in Table 1. To achieve that, only the URLs found within the same domain name are inserted and indexed in the data server. An interesting feature of our Annotator is that it supports multiple languages, i.e. English, German and Italian and the corresponding official translations of SNOMED CT. We have to note that other ontologies can be used as well and actually using them is only a matter of providing the corresponding ontology files to the Annotator. However, we selected SNOMED-CT due to its extensive coverage of medical terms and the available translations in all three mentioned languages.

Figure 2. The Annotator app interface: the interface of the annotator application.

PHIR for patients

The recommendation service

The results presented to the patient, by the PHIR Search Engine are provided by the recommendation service. The service considers the following databases to make the results of the query as personalised as possible:

(a) Patient preferences database: this database contains user preferences that are acquired as the patient browses the results presented to him/her. User’s selections and rating for a specific search result are logged to augment similar future searches.

(b) Medical conditions database: the medical conditions of the patient as they have been logged by the patient himself. The entries in the database are annotated using SNOMED-CT, Logical Observation Identifiers Names and Codes (LOINC) and RXTerms terms while the patients do data entry. We have to note that numerical fields are not annotated.

(c) SOLR indexing database: the database with the auto-generated tags from SNOMED-CT along with the annotated tags that the domain experts have provided.

The recommendation service is the bridge among the patient’s information, preferences and the indexed documents. The content that is identified as relevant is scored according to its dynamic relevance to the patient. In addition to the objective relevance, the system takes into account subjective information such as the content, which the patient has already seen, or the type of content the patient prefers (articles for the general public or specialist information, text or image, etc.).

Semantic search has been proven efficient even for challenging domains such as the biomedical resource discovery [21]. The algorithm used for semantic search in the PHIR follows a gradual combination of the different results based on the user’s profile and the user’s query as shown in Figure 3.

The results are presented as a list, containing the SOLR response set to questions in the following hierarchy:

(a) Union of answers based on the user’s provided medical profile and the user’s given query

(b) Remaining answers based on the user’s provided medical profile

(c) Remaining answers based on the user’s query

The sorting of SOLR answers is based on SOLR analysis and indexing process.

The iPHR supplies feedback about the displayed information from the patient, either implicitly or explicitly, such that the relevance of the displayed information can be adopted over time. The process of adapting content to specific user needs can be thought of as two main sub-processes. The first sub-process involves understanding what content can be most relevant to the current user’s interests, and how this content should be organised. The second sub-process involves a decision on how to effectively present the selected content to the user.

Table 1. Crawler extracted information—an example: the table illustrates the information extracted from the crawler mechanism which is a part of the Annotator app. The first column shows an example of a URL imported in the Annotator by an expert. The second column shows the URLs extracted from the given one that will also be imported in the system.

Figure 3. Search Engine app interface: the interface of the search engine application.

The PHIR search engine application

Using the iPHR system, the patient, besides logging and reviewing their medical information, is able to search for relevant, high-quality information using an intelligent search engine provided as an individual iPHR app. The interface of the search engine is shown in Figure 3. The Search engine app sends the user query to the recommendation service and then the app visualises the returned results. The returned results are shown in tabs. A tab includes all the results and the next two tabs have them separated by type (text and video).

As well as searching for relevant results, the patient is able to rate a result according to his/her opinion using a scale of one to five. The clicks and the rankings of each user are stored in the patient preferences database, which in turn enhances the personalisation factor of the searching mechanism in future searches.

The PHIR search service

The PHIR Search Engine is also available to external applications through an application programming interface (API). Using the available API, an application can forward a query to the PHIR in order to receive the results in a list. An example of the JavaScript Object Notation (JSON) results returned is shown in Figure 4 whereas the corresponding JSON results returned are shown in Figure 5.

The returned results can then be visualised and processed appropriately by the external application.

Evaluation design

In this section, we present the design of our initial evaluation in order to test the PHIR system. To evaluate our system, as already mentioned, domain experts from the iManageCancer project have already registered a plethora of web documents. Those documents contain high-quality information for patients in the cancer domain, in three languages: English, German and Italian. Those documents were selected by contact persons in the cancer domain from the (a) Information Office at the cancer charity Tenovus Cancer Care (, (b) ecancer (, (c) the European Institute of Oncology ( in Milan and (d) the University Hospital of Saarland (

Figure 4. Search engine results: a Venn diagram that represents the search engine results and their combination.

Figure 5. Example of data representation in SOLR: an example of a document to be inserted in SOLR and its fields. Each field has a name and content.

For quality assurance, norms defined by the International Organization for Standardization (ISO) such as the ISO/International Electrotechnical Commission (IEC) 25000 standard ( have been used as a reference model. PHIR components passed (a) a technical verification assuring that each component consistently produces the expected results, (b) a technical validation assuring the quality of the desired outcome conforming to user needs and intended uses and (c) an internal evaluation demonstrating that the system has a positive benefit to the end users. Table 2 provides an overview of the validation and evaluation results for each sub-characteristic proposed by the ISO/IEC 25000 standard. Performance, resource utilisation and capacity were measured using a web performance tool (, while the rest of the characteristics and sub-characteristics have been measured based on the functionality of the system and user requirements.

Furthermore, physicians participating in the iManageCancer consortium from the European Institute of Oncol-ogy and the University Hospital of Saarland guided the developers and performed an initial evaluation. They provided feedback with the main objective to create a tool that they would recommend to their patients. After two internal iterations (evaluation workshops with physicians as end users), and a preliminary evaluation using a small set of patients at a cancer conference in Milan, the PHIR reached its stable stage (version 1.0 of the tool). The results of the internal evaluation indicate the potential for the tool to help patients to gather precise information about their diagnosis and treatment.

Table 2. Evaluation results.

The main evaluation of PHIR will be based on the pilots of iManageCancer, which will assess the benefit of the PHIR and the PHR platform as a whole. The iManageCancer platform will be evaluated in two healthcare domains for children and adults. The enrolment of the patients has already started and is estimated that by mid-2018, 60 nephroblastoma patients and their parents will participate in the children’s pilot and 120 prostate and breast cancer patients will participate in the adult’s pilot. Both the pilots will have a duration of at least 6 months. Patients will answer usability questionnaires during the period of the pilot. The system also logs the time each patient spends using PHIR and how frequently he/she has used the system. The personalised preferences (user can rate if the proposed content is close to what he/she expected to view) are also logged. The results of these three different sources will identify the usability and the usefulness of our search engine, whereas analytics will be used to provide further insights on the PHIR usage [22]. Although the number of patients is limited, we expect this to give us some directions and hints for the future development of our engine.

Conclusion and discussion

The changing nature of information distribution due to the evolution of the Internet has important implications for health care. Given the wide use of the Internet in providing medical information, providing patients with appropriate content might further enhance the patient’s education and experience. The validity and the quality of the available healthcare information on the Internet is an area of major concern mainly because these have not been well-documented [23]. Although healthcare professionals should continue to strive to be the main source of information for patients, we should also be aware that most will continue to use the Internet to gather information [24]. Berg et al [25] concluded that the optimal solution for patients is to be guided by healthcare providers to more optimal resources over the web. Delivering accurate sources to the patient increases his/her knowledge and changes the way of thinking which is usually referred to as patient empowerment. As a result, the patient’s dependency for information from the doctor is reduced. Moreover, patients feel autonomous and more confident about the management of their disease [26].

The PHIR platform focuses on making the available information timelier and more relevant with respect to dynamic influences on the individual patient’s treatment. While PHIR focuses on patient’s recommendations, the architecture of the system can be applied to other fields, such as tools discovery [21], by replacing the ontologies and the search cohort. The idea behind PHIR is that even if two patients suffer from the same disease and they are in the same phase of treatment, their interests regarding the available information may differ based on various factors. For example, patients might have different medical backgrounds which are not directly related to the treatment, they might receive additional drugs due to other, independent treatments, or they might be affected by other external factors such as the weather in case of allergies. More subjective factors include patient preferences for more simple or more complex information. To the best of our knowledge, no other system providing medical information is able to be dynamically adapted in such a diverse environment. After the end of the pilots, we plan to provide the PHIR source code for free under Apache License 2.0.

Conflicts of interest

The authors declare no conflict of interest.

Authors’ contributions

Galatia Iatraki and Maria Chatzimina implemented the tool and wrote the implementation part of the paper. Haridimos Kondylakis and Lefteris Koumakis designed the algorithms behind the implementation and described the corresponding parts of the paper. Kostas Marias and Manolis Tsiknakis supervised the development and contributed with the introduction and the discussion sections.


This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 643529 (iManageCancer project). This article reflects the authors’ view. The Commission is not responsible for any use that may be made of the information it contains.


1. McMullan M (2006) Patients using the Internet to obtain health information: how this affects the patient–health professional relationship Patient Educ Couns 63(1) 24–28

2. Broom A (2005) Virtually he@lthy: the impact of Internet use on disease experience and the doctor-patient relationship Qual Health Res 15(3) 325–345 PMID: 15761103

3. Ahmad F, Hudak PL, and Bercovitz K, et al (2006) Are physicians ready for patients with Internet-based health information? J Med Internet Res 8(3) e22 PMID: 17032638 PMCID: 2018833

4. Kondylakis H, Koumakis L, and Rüping S, et al (2014) PMIR: a personal medical information recommender Med Inform Eur 205 1193

5. Kondylakis H, Koumakis L, and Kazantzaki E, et al (2015) Patient empowerment through personal medical recommendations MedInfo 216 1117

6. Kondylakis H, Bucur A, and Dong F, et al (2017) iManageCancer: developing a platform for Empowering patients and strengthening self-management in cancer diseases 30th IEEE Int Symp Computer-Based Medical Syst—IEEE CBMS 755–760

7. Graf N, Kondylakis H, and Koumakis L, et al (2018) Patient empowerment with the help of ICT—the iManageCancer project IEEE BHI

8. Kondylakis H, Koumakis L, and Tsiknakis M, et al (2018) Implementing a data management infrastructure for big healthcare data IEEE BHI 361–364

9. Boyer C, Selby M, and Scherrer JR, et al (1998) The Health On the Net Code of Conduct for medical and health Websites Comput Biol Med 28 603–610

10. Khazaal Y, Chatton A, and Zullino D, et al (2012) HON label and DISCERN as content quality indicators of health-related Websites Psychiatr Q 83 15–27

11. Cao Y, Liu F, and Simpson P, et al (2011) AskHERMES: an online question answering system for complex clinical questions J Biomed Inform 44(2) 277–288

12. Genitsaridi I, Kondylakis H, and Koumakis L, et al (2013) Towards intelligent personal health record systems: review, criteria and extensions Procedia Comput Sci 21 327–334

13. Kondylakis H, Spanakis M, and Sfakianakis S, et al (2015) Digital patient: personalized and translational data management through the MyHealthAvatar EU project IEEE Engineering Med Biol Society (EMBC) 1397–1400

14. Roitman H, Mesika Y, and Tsimerman Y, et al (2010) Increasing patient safety using explanation-driven personalized content recommendation ACM Intl Health Informatics Symp (IHI) 430–434

15. Doupi D, Wijk M, and Wyk J et al (2008) Personalizing patient education using internet health resources and EPR data: pilot evaluation of the STEPPS prototype system CBMS 449–454

16. Lali M, Ul-Mustafa R, and Saleem K, et al (2017) Finding healthcare issues with search engine queries and social network data Int J Semantic Web Inf Syst 13(1) 48–62

17. Stratigi M, Kondylakis H, and Stefanidis K (2017) Fairness in group recommendations in the health domain 33rd IEEE Int Conf Data Engineering (ICDE) 1481–1488

18. Hanauer D, Wu D, and Yang L, et al (2017) Development and empirical user-centered evaluation of semantically-based query recommendation for an electronic health record search engine J Biomed Inform 67 1–10

19. Kondylakis H, Kazantzaki E, and Koumakis L, et al (2014) Development of interactive empowerment services in support of personalized medicine eCancer Med Sci J 8 400

20. Kondylakis H, Koumakis L, and Genitsaridi E, et al (2012) IEmS: a collaborative environment for patient empowerment 12th Int Conf Bioinformatics Bioengineering (BIBE) 535–540

21. Sfakianaki P, Koumakis L, and Sfakianakis S, et al (2015) Semantic biomedical resource discovery: a natural language processing framework BMC Med Inform Decis Mak 15(1) 77

22. Koumakis L, Kondylakis H, and Chatzimina M, et al (2016) Designing smart analytical data services for a personal health framework pHealth 224 123–128

23. Agliardi A and Jadad AR (2002) Examination of instruments used to rate quality of health information on the Internet: chronicle of a voyage with an unclear destination BMJ 324 569–573

24. Scullard P, Peacock C, and Davies P (2010) Googling children’s health: reliability of medical advice on the Internet Arch Dis Child 95(8) 580–582

25. Berg GM, Hervey AM, and Atterbury D, et al (2014) Evaluating the quality of online information about concussions JAAPA 27(2) 1–8 PMID: 24463744

26. Wiesner M and Pfeifer D (2010) Adapting recommender systems to the requirements of personal health record systems Proceedings 1st ACM Int Health Informatics Symp 410–414

Related Articles

Josepmilly del Valle Peña Colmenares, Carmen Cristina García, Yazmin José Velásquez Velásquez, Leider Arelis Campos Pino, Álvaro Gómez Rodríguez, Wladimir José Villegas Rodríguez, David José González Vargas, Douglas José Angulo Herrera
Table of Contents
Table of Contents