1 Big data for health. Third pillar of CERN: “Storage and treatment of large amounts of data and detailed simulation”. Prof. Philippe Lambin, U.H. Maastricht
2 Why? Reusing data Modified from Deasy et al. Courtesy of Jo Deasy
3 Why? Limitations of evidence-based medicine. Conventional Clinical Research: High data quality; Less than 3% of the patients; Highly biased population; Randomized trials rarely done for new technologies; Low data quantity; Controlled; Assigned patients; “EORTC-RTOG grade” QA/Protocol; Biobanking, translational research
4 Example: having no evidence can have dramatic consequences. Rutten et al. Lancet Oncology 2008; 9: 494
5 Lambin et al. Aug 28, Radiother Oncol 2013, Adv. Drug Dev 2016. The solution? Use the 97%: Rapid Learning Health Care or “Big data in health care”. “In [..] rapid-learning [..] data routinely generated through patient care and clinical research feed into an ever-growing [..] set of coordinated databases.” J Clin Oncol 2010;28:4268. “[..] rapid learning [..] where we can learn from each patient to guide practice, is [..] crucial to guide rational health policy and to contain costs [..].” Lancet Oncol 2011;12:933. Examples: Radiotherapy CAT (www.eurocat.info), ASCO’s CancerLinQ.
6 Conventional Clinical Research vs. Rapid Learning Health Care (“Big Data”)
Conventional Clinical Research: High data quality; Low data quantity; Controlled; Assigned patients; “EORTC-RTOG grade” QA/Protocol; Biobanking, translational research
Rapid Learning Health Care (“Big Data”): Low data quality; High data quantity; Reality; Unassigned patients; “Clinical grade” QA/Protocol; Ad hoc biobanking/translational research
Relton C et al. BMJ. 2010; Burbach et al. Trials 2015; Lambin et al. Acta Oncol 2015
7 Example of clinically relevant questions: Treatment of an 80-year-old with rectal cancer? A 70-year-old with Stage IIIB NSCLC? A 60-year-old with prostate cancer and oligometastasis? Local relapse of a stage 3 oropharynx cancer? Cervix cancer stage 3, HIV+? … Big Data to save lives
8 Watch the animation: http://youtu.be/ZDJFOxpwqEA
9 “Pan-omics approach”: Multifactorial Decision Support System. But we need data, preferably all of them. How? Lambin et al. Nature Rev. Clin. Oncol.
10 In-hospital infrastructure & de-identification. Firewalls. SPARQL query (distributable). Std ROO = Ontology. SPARQL = “SPARQL Protocol and RDF Query Language”, a query language for databases. De-identification: removal of obvious patient identifiers (name, MRN, social security number, etc.); assignment of a persistent token pseudonym; change (data banding) of obvious but required patient identifiers (everyone born and died on the 15th of the month, part of the postal code). No individual patient data leaves the hospital.
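The de-identification steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the euroCAT implementation: the field names, the keyed-hash pseudonym, and the two-character postal code cut-off are all assumptions made for the example.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical field names; a real hospital record will differ.
DIRECT_IDENTIFIERS = {"name", "mrn", "ssn"}


def pseudonym(mrn: str, secret_key: bytes) -> str:
    """Persistent token: the same MRN and key always give the same token."""
    digest = hmac.new(secret_key, mrn.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def band_date(d: date) -> date:
    """Data banding: keep year and month, set the day to the 15th."""
    return d.replace(day=15)


def band_postal_code(code: str, keep: int = 2) -> str:
    """Keep only the first `keep` characters (assumed cut-off)."""
    return code[:keep]


def deidentify(record: dict, secret_key: bytes) -> dict:
    # Drop the obvious identifiers, then add the token and banded fields.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["patient_token"] = pseudonym(record["mrn"], secret_key)
    out["birth_date"] = band_date(record["birth_date"])
    out["postal_code"] = band_postal_code(record["postal_code"])
    return out


# Made-up record, for illustration only.
record = {
    "name": "Jane Example",
    "mrn": "000123",
    "ssn": "not-a-real-number",
    "birth_date": date(1950, 3, 2),
    "postal_code": "6229ET",
    "tumour_site": "rectum",
}
clean = deidentify(record, secret_key=b"hospital-local-secret")
```

The key stays inside the hospital, so the token is stable across data exports but cannot be recomputed by anyone outside.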
11 Ontology – International Coding System. 1. Select the local term, e.g. 声门下区 (Chinese for “subglottic region”). 2. Search the ontology for the matching concept. 3. Map the local term to the ontology. 4. See the result of your mapping.
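The four mapping steps can be sketched as follows. The ontology fragment and its `http://example.org/onto#` identifiers are hypothetical placeholders, not real ROO codes; only the local term 声门下区 comes from the slide.

```python
# Hypothetical ontology fragment: concept identifier -> English label.
ONTOLOGY = {
    "http://example.org/onto#Subglottis": "subglottis",
    "http://example.org/onto#Glottis": "glottis",
    "http://example.org/onto#Supraglottis": "supraglottis",
}


def search(query: str) -> list:
    """Step 2: return the concepts whose label contains the query."""
    q = query.lower()
    return sorted(iri for iri, label in ONTOLOGY.items() if q in label)


def map_term(mapping: dict, local_term: str, concept: str) -> None:
    """Step 3: record that a local term means a given ontology concept."""
    if concept not in ONTOLOGY:
        raise KeyError(f"unknown concept: {concept}")
    mapping[local_term] = concept


def resolve(mapping: dict, local_term: str) -> str:
    """Step 4: show the shared label behind a local term."""
    return ONTOLOGY[mapping[local_term]]


mapping = {}
local_term = "声门下区"                    # step 1: select the local term
candidates = search("subglottis")          # step 2: search the ontology
map_term(mapping, local_term, candidates[0])  # step 3: map
result = resolve(mapping, local_term)      # step 4: see the result
```

Once every hospital has mapped its local terms to the same concepts, a single query can be phrased in ontology terms and run everywhere.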
12 The Semantic Web The Semantic Web is an extension of the Web through standards by the World Wide Web Consortium (W3C). The standards promote common data formats and exchange protocols on the Web. According to the W3C, "The Semantic Web provides a common framework* that allows data to be shared and reused across application, enterprise, and community boundaries". The term was coined by Tim Berners-Lee for a web of data that can be processed by machines. *SPARQL is a semantic query language for databases
13 An ontology is more than a dictionary. An ontology is a set of terms and their relationships. With it we have “machine readable data” accessible to Artificial Intelligence.
14 SPARQL. SPARQL (pronounced "sparkle", an acronym for SPARQL Protocol and RDF Query Language) is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in Resource Description Framework (RDF) format. It was made a standard by the World Wide Web Consortium, and is recognized as one of the key technologies of the semantic web. In 2013, SPARQL 1.1 became an official W3C Recommendation.
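To make the idea of querying RDF concrete, the sketch below stores a few triples in plain Python and matches them against two triple patterns, which is the core of a SPARQL `SELECT`. It is a toy matcher written for illustration, not a SPARQL engine; the `ex:` names and the patient data are invented.

```python
# The SPARQL query that the Python code below imitates (shown for reference).
QUERY_TEXT = """
PREFIX ex: <http://example.org/>
SELECT ?patient ?age WHERE {
  ?patient ex:hasDiagnosis ex:RectalCancer .
  ?patient ex:hasAge ?age .
}
"""

# RDF data is a set of (subject, predicate, object) triples. Invented data.
TRIPLES = [
    ("ex:patient1", "ex:hasDiagnosis", "ex:RectalCancer"),
    ("ex:patient1", "ex:hasAge", 80),
    ("ex:patient2", "ex:hasDiagnosis", "ex:NSCLC"),
    ("ex:patient2", "ex:hasAge", 70),
]


def is_var(term) -> bool:
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if p in new and new[p] != t:
                return None  # variable already bound to something else
            new[p] = t
        elif p != t:
            return None  # constant term does not match
    return new


def solve(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                nb = match_pattern(pattern, t, b)
                if nb is not None:
                    extended.append(nb)
        bindings = extended
    return bindings


rows = solve(
    [("?patient", "ex:hasDiagnosis", "ex:RectalCancer"),
     ("?patient", "ex:hasAge", "?age")],
    TRIPLES,
)
```

Here `rows` holds one binding, pairing `ex:patient1` with age 80, because only that patient satisfies both patterns.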
15 SPARQL: Query language for applications. Hospitals; Semantic box (the secondary research database); SPARQL: language for applications; ROO: Radiation Oncology Ontology
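A minimal sketch of how a distributable query can respect the rule that no individual patient data leaves the hospital: each site answers with aggregates only, and the coordinator combines them. The `Hospital` class, the record fields, and the numbers are assumptions for illustration, not the CAT infrastructure.

```python
from dataclasses import dataclass, field


@dataclass
class Hospital:
    name: str
    records: list = field(default_factory=list)  # never sent outside

    def answer(self, diagnosis: str) -> dict:
        """Run the query locally and return aggregates only."""
        ages = [r["age"] for r in self.records if r["diagnosis"] == diagnosis]
        return {"n": len(ages), "age_sum": sum(ages)}


def pooled_mean_age(hospitals, diagnosis: str):
    """Combine per-hospital aggregates into one pooled mean."""
    replies = [h.answer(diagnosis) for h in hospitals]
    n = sum(r["n"] for r in replies)
    total = sum(r["age_sum"] for r in replies)
    return total / n if n else None


# Invented data for two sites.
site_a = Hospital("A", [
    {"age": 80, "diagnosis": "rectal cancer"},
    {"age": 70, "diagnosis": "rectal cancer"},
    {"age": 60, "diagnosis": "NSCLC"},
])
site_b = Hospital("B", [
    {"age": 75, "diagnosis": "rectal cancer"},
])

mean_age = pooled_mean_age([site_a, site_b], "rectal cancer")
```

The coordinator sees two small dictionaries of counts and sums, never the rows themselves. In practice very small counts can still be revealing, so a real system would need additional safeguards.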
16 Funded: euroCAT, duCAT, chinaCAT, VATE, ozCAT. New: ukCAT, indiaCAT. [Map: active or funded CAT partners (17) and prospective centers. Map from cgadvertising.com]
17 What next? The patient managing their own data. Data = Gold. Our vision in 2 min: “from hospital to patient”
18 From data to models to Virtual patient?
19 The 5 P’s of modern medicine (modified from Leroy Hood): « P » for Personalized, « P » for Preventive, « P » for Predictive, « P » for Participatory, « P » for Parsimonious
20 Shared Decision Making 1.0 with Decision aids
21 Shared Decision Making 2.0: model-based virtual patient or avatar-based Shared Decision Making
22 Data; Patient avatar (or similar patients); Simulation/DSS: virtual treatments, virtual clinical trials, virtual scenarios with preventive personalized interventions … Game changer!
23 Take home message: we need privacy-preserving Big Data to build multifactorial decision support systems and shared decision making; patient avatars (= model-based virtual patients or similar patients) used for simulations of virtual treatment in virtual hospitals and virtual clinical trials. Big Data to save lives
24 What next? Backcasting* (* = a planning method that starts with defining a desirable future and then works backwards to identify policies and programs that will connect the future to the present)
25 Thank you for your attention
26 Acknowledgements: CHU Liege, Belgium; Uniklinikum Aachen, Germany; LOC Genk/Hasselt, Belgium; Catherina Zkh Eindhoven, Netherlands; Policlinico Gemelli, Roma, Italy; UH Ghent, Belgium; UH Leuven, Belgium; UH Nijmegen, Netherlands; … Main MAASTRO collaborators: Andre Dekker, Cary Oberije, Timo Deist, Erik Roelofs, Arthur Jochems, Sean Walsh, Ralph Leijenaar, Janita van Timmeren
27 Reserve slides
28 Open source data of publications: www.cancerdata.org
29
30 ESTRO Course 2014
31 Can you give me examples of new knowledge coming from Big Data approaches?
32 The Radiomic hypothesis: one can extract more quantitative information from standard imaging. Radiology: implicit knowledge, interpretability. QUANTIFICATION. RADIOMICS: extract quantitative features from images. Lambin et al. EJC, 2012; Aerts, Lambin et al. Nature Commun 2014
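As a small illustration of what "extract quantitative features from images" can mean, the sketch below computes a few simple first-order statistics over the voxel intensities of a region of interest. The choice of features and the assumption that intensities are already discretized are ours; this is not the feature set used in the cited papers.

```python
import math
from collections import Counter


def first_order_features(voxels):
    """Simple intensity statistics for a list of discretized voxel values."""
    n = len(voxels)
    mean = sum(voxels) / n
    variance = sum((v - mean) ** 2 for v in voxels) / n
    energy = sum(v * v for v in voxels)
    # Shannon entropy of the intensity histogram, in bits.
    counts = Counter(voxels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "mean": mean,
        "variance": variance,
        "energy": energy,
        "entropy": entropy,
    }


# Toy region of interest: four voxels with two intensity levels.
roi = [1, 1, 2, 2]
features = first_order_features(roi)
```

For this toy region the mean is 1.5, the variance 0.25, the energy 10, and the entropy 1 bit, since the two intensity levels are equally frequent.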
33 Predict survival in lung and head & neck cancer better than TNM. Aerts…Lambin, Nature Commun 2014; Leijenaar et al. Acta Oncol 2015
34 Entering the OMICS era… Radiomics. Lambin et al. EJC, 2012; Aerts, Lambin et al. Nature Commun 2014
35 Watch the animation: http://youtu.be/Tq980GEVP0Y Or the website: www
36 Lambin et al. Aug 28, Radiother Oncol 2013, Adv. Drug Dev 2016. Take home message: We need Decision Support Systems (DSS = a “meta TPS”) to manage the large quantity of data and implement personalized medicine in radiotherapy, in particular for protontherapy due to its costs. Two complementary approaches: conventional clinical trials (+ data reuse) + “Big Data approach” (Rapid Learning Health Care). Building cancer informatics tools to enable analysis, exploration, and rapid evaluation of novel therapies or stratification, e.g. distributed learning based on semantic web technology. DSS facilitate Shared Decision Making, participative precision medicine and cost-effective health care (the 4th & 5th “P”). One key example could be protontherapy.