Denaxas Lab

We are a multidisciplinary research lab of data scientists, epidemiologists, software engineers and clinicians working at the intersection of medicine and computer science. Our research uses real-world data (electronic health records, administrative data, disease audits) and data-driven methods to improve human health and healthcare.

COVID-19 response

In response to the public health emergency caused by the SARS-CoV-2 virus and the COVID-19 pandemic, members of our Lab are leading or contributing to major research initiatives:

  • CVD-COVID-UK aims to understand the relationship between COVID-19 and cardiovascular diseases such as heart attack, heart failure, stroke, and blood clots in the lungs through analyses of de-identified, linked, nationally collated healthcare datasets across the four nations of the UK.

  • DECOVID uses detailed and frequently updated health data from hospitals as the COVID-19 pandemic unfolds, to allow clinicians and researchers to generate rapid and robust insights that can lead to more effective clinical treatment strategies, helping patients, healthcare professionals and society.

  • BHF COVIDITY-COHORT complements CVD-COVID-UK and will harness the power of over a dozen large UK cohort studies, whose participants have already provided a wealth of information about their cardiovascular health and their general health and wellbeing.

Research output

  • Lai A. et al. Estimated impact of the COVID-19 pandemic on cancer services and excess 1-year mortality in people with cancer and multimorbidity: near real-time data on cancer care, cancer deaths and a population-based cohort study. BMJ Open 10.1136/bmjopen-2020-043828

    Media coverage: The Guardian, Daily Mail, Evening Standard, BMJ.

  • Banerjee A. et al. Clinical academic research in the time of Corona: a simulation study in England and a call for action. PLOS ONE 10.1371/journal.pone.0237298

  • Dennis J. M. et al. Type 2 Diabetes and COVID-19-Related Mortality in the Critical Care Setting: A National Cohort Study in England, March-July 2020. Diabetes Care 10.2337/dc20-1444

  • Katsoulis M. et al. Estimating the effect of reduced attendance at emergency departments for suspected cardiac conditions on cardiac mortality during the COVID-19 pandemic. Circulation: Cardiovascular Quality and Outcomes 10.1161/CIRCOUTCOMES.120.007085

  • CVD-COVID-UK Consortium. The 4C Initiative (Clinical Care for Cardiovascular disease in the COVID-19 pandemic): monitoring the indirect impact of the coronavirus pandemic on services for cardiovascular diseases in the UK. 10.1101/2020.07.10.20151118

  • Mateen B. et al. A geotemporal survey of hospital bed saturation across England during the first wave of the COVID-19 Pandemic. 2020.06.24.20139048

  • Katsoulis M. et al. Obesity during the COVID-19 pandemic: cause of high risk or an effect of lockdown? A population-based electronic health record analysis in 1 958 184 individuals. 10.1101/2020.06.22.20137182

Research

CALIBER

Our lab runs CALIBER, a research platform that provides reproducible phenotyping algorithms for electronic health records. We use data from primary care (Clinical Practice Research Datalink), hospital admissions (Hospital Episode Statistics), socioeconomic deprivation information (using the Index of Multiple Deprivation) and cause-specific mortality data (Office for National Statistics) in England for ~15 million individuals. CALIBER enables researchers to recreate the longitudinal pathway of patients through healthcare settings and study disease onset and progression.

Recent research examples:

UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER. Journal of the American Medical Informatics Association, 2019.

Data resource profile: cardiovascular disease research using linked bespoke studies and electronic health records (CALIBER). International Journal of Epidemiology, 2012.

Open science

Our lab is committed to producing open and reproducible science. We lead the HDR CALIBER Phenotype Library, a comprehensive, open-access resource providing the research community with information, tools and phenotyping algorithms for UK electronic health record data. The Phenotype Library curates approximately 30,000 controlled clinical terminology terms across 350 rule-based phenotyping algorithms that use structured UK EHR data sources. Phenotypes have been extensively validated by generating six layers of evidence: aetiological, prognostic, case-note review, genetic, cross-EHR and cross-country replication.

Recent research examples:

A chronological map of 308 physical and mental health conditions from 4 million individuals in the English National Health Service. The Lancet Digital Health, 2019.

A semi-supervised approach for rapidly creating clinical biomarker phenotypes in the UK Biobank using different primary care EHR and clinical terminology systems. medRxiv, 2020.

Electronic Health Record (EHR) phenotyping

Raw EHR data require a substantial amount of preprocessing before they can be transformed into research-ready datasets that can be statistically analyzed to answer clinically meaningful questions. Our lab develops computational algorithms for defining, validating and ascertaining multi-modal disease phenotypes in EHR data. The resulting phenotypes are stored in an open-access Data Portal.
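
To illustrate what a rule-based phenotype can look like in practice, here is a minimal Python sketch. It is not taken from our phenotyping algorithms: the `Event` record, the `first_diagnosis` function and the primary care code `AF_DIAG_1` are hypothetical names made up for this example, and ICD-10 `I48` (atrial fibrillation and flutter) is the only real clinical code used. The rule finds the earliest date on which each patient has a qualifying code in any linked source.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """One coded clinical event (hypothetical, simplified record layout)."""
    patient_id: str
    source: str  # e.g. "primary_care" or "hospital"
    code: str
    when: date


# Hypothetical codelist: a set of qualifying codes per data source.
# "I48" is the ICD-10 code for atrial fibrillation and flutter;
# "AF_DIAG_1" is a placeholder, not a real terminology code.
AF_CODELIST = {
    "hospital": {"I48"},
    "primary_care": {"AF_DIAG_1"},
}


def first_diagnosis(events, codelist):
    """Return {patient_id: earliest date with a qualifying code in any source}."""
    earliest = {}
    for event in events:
        if event.code not in codelist.get(event.source, set()):
            continue  # code is not part of the phenotype for this source
        previous = earliest.get(event.patient_id)
        if previous is None or event.when < previous:
            earliest[event.patient_id] = event.when
    return earliest


# Toy events for two made-up patients.
events = [
    Event("p1", "hospital", "I48", date(2015, 3, 1)),
    Event("p1", "primary_care", "AF_DIAG_1", date(2014, 6, 1)),
    Event("p2", "hospital", "J45", date(2016, 1, 1)),  # asthma, not in codelist
]
onset = first_diagnosis(events, AF_CODELIST)
```

Keeping the codelist separate from the rule means the same function can be reused for other conditions by swapping in a different codelist.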

Recent research examples:

Defining disease phenotypes using national linked electronic health records: a case study of atrial fibrillation. PLOS ONE, 2014.

Bleeding in cardiac patients prescribed antithrombotic drugs: electronic health record phenotyping algorithms, incidence, trends and prognosis. BMC Medicine, 2019.

Unsupervised machine learning for sub-phenotype discovery

There is a growing body of evidence from observational and interventional research suggesting that complex diseases, such as type 2 diabetes, asthma and chronic obstructive pulmonary disease (COPD), are composed of distinct sub-phenotypes with different risk factors and prognostic profiles. Our lab develops and evaluates data clustering algorithms to identify, describe and evaluate disease subtypes that can lead to the development of personalized treatments.
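
As a rough illustration of the clustering idea, the sketch below runs plain k-means (Lloyd's algorithm) on a handful of made-up two-feature patient profiles. K-means is used here only because it is short and widely known; it is not necessarily the algorithm used in our studies, and the function name `kmeans`, the toy data and the deterministic start are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm).

    Uses the first k points as starting centroids so the example is
    deterministic; real analyses would use a more careful initialisation.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up, already-scaled features for six patients (two obvious groups).
patients = [
    (1.0, 1.0), (9.0, 9.0), (1.2, 0.8),
    (8.8, 9.1), (0.9, 1.1), (9.2, 8.9),
]
labels, centroids = kmeans(patients, k=2)
```

In practice, the harder questions are choosing the number of clusters and checking that the resulting subtypes are clinically meaningful and reproducible, which a toy example like this does not address.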

Recent research examples:

Identifying clinically important COPD sub-types using data-driven approaches in primary care population based electronic health records. BMC Medical Informatics and Decision Making, 2019.

Supervised machine learning for risk stratification

Most traditional risk prediction approaches rely on regression-based statistical methods and may fail to take into account the richness of electronic health record data. Our lab evaluates supervised machine learning approaches for creating accurate and interpretable risk prediction tools.

Recent research examples:

Application of Clinical Concept Embeddings for Heart Failure Prediction in UK EHR data. NIPS ML4H: Machine Learning for Health, arXiv preprint arXiv:1811.11005, 2018.

Prognostic models for people with stable coronary artery disease based on 115,500 patients from the CALIBER study. European Heart Journal, 2012.

Data Linkage

Data linkage is the process of identifying and linking individuals across heterogeneous data sources. Working with the Federal University of Bahia, our lab is contributing to the development of scalable probabilistic data linkage methods for linking administrative data on over 140 million participants in Brazil, and to evaluating the quality of the linkage using supervised machine learning.
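
For readers new to probabilistic linkage, the sketch below shows the classic Fellegi-Sunter match weight for a pair of records. It is a generic textbook formulation, not a description of the methods developed in this collaboration. The field names, the m- and u-probabilities and the function name `match_weight` are hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELD_PROBABILITIES = {
    "name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "municipality": (0.90, 0.05),
}


def match_weight(record_a, record_b, fields=FIELD_PROBABILITIES):
    """Sum of Fellegi-Sunter log2 likelihood ratios over the compared fields.

    Agreement adds log2(m / u); disagreement adds log2((1 - m) / (1 - u)).
    A field missing from either record is skipped (treated as no evidence).
    """
    total = 0.0
    for field, (m, u) in fields.items():
        a, b = record_a.get(field), record_b.get(field)
        if a is None or b is None:
            continue
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


# Made-up records: the first pair agrees on every field, the second on none.
rec_1 = {"name": "maria silva", "birth_date": "1980-02-14", "municipality": "salvador"}
rec_2 = {"name": "maria silva", "birth_date": "1980-02-14", "municipality": "salvador"}
rec_3 = {"name": "joao souza", "birth_date": "1975-09-30", "municipality": "recife"}

weight_same = match_weight(rec_1, rec_2)       # positive: evidence for a match
weight_different = match_weight(rec_1, rec_3)  # negative: evidence against
```

Pairs are then typically classified as links, non-links or possible links by comparing the weight against thresholds. Choosing those thresholds and assessing the resulting linkage quality is where evaluation methods come in.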

People

Academic members

Dr Ana Torralbo

Research Fellow in Health Informatics

Dr Arturo Gonzalez-Izquierdo

Senior Research Associate in Electronic Health Records

Dr Ghazaleh Fatemifar

AHA Research Fellow

Dr Marina Daskalopoulou

Research Associate

Dr Michalis Katsoulis

BHF Research Fellow

Dr Václav Papež

Clinical Data Scientist

Muhammad Qummer Ul Arfeen

Data Manager

Natalie Fitzpatrick

Research Data Coordinator

Spiros Denaxas

Professor of Biomedical Informatics

Visiting researchers

Colin Josephson

Assistant Professor of Neurology and Community Health Sciences (Univ. of Calgary)

Dr Maria Pikoula

Clinical Data Scientist

Prof Marcos Barreto

Royal Society Newton Fellow

Students

Albert Henry

PhD Candidate

Andre Vauvelle

PhD Candidate

Maxine Mackintosh

PhD Candidate

Nonie Alexander

PhD Candidate

Research administration

Cécile Brémont

Research administrator

Alumni

Christiana McMahon

Graduate Student in Health Informatics

Faiz Punakkath

Computer Science Student

Hannah Evans

Biostatistician

Harry Boutselakis

Health Data Architect

Henrietta Forssen

Graduate Student in Machine Learning

Kenan Direk

Research Associate in Electronic Health Records

Maria Paraskevopoulou

Postgraduate student in Machine Learning

Marie Erwood

IMAGINE Data Manager

Sam Butler

Undergraduate student in Mathematics

Tariq Khatri

Postgraduate student in Machine Learning

Contact