Denaxas Lab

Electronic Health Record (EHR) phenotyping
Raw EHR data require substantial preprocessing before they can be transformed into research-ready datasets that can be statistically analyzed to answer clinically meaningful questions. Our lab develops computational algorithms for defining, validating and ascertaining multi-modal disease phenotypes in EHR data. The resulting phenotypes are stored in an open-access Data Portal.
More information: Atrial fibrillation phenotyping exemplar in PLOS ONE.
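As a rough illustration of what a code-based phenotype definition can look like, the sketch below flags patients with at least one atrial fibrillation diagnosis code and keeps the earliest date. It is a minimal example under stated assumptions, not the lab's published algorithm: the record layout, the function name `first_af_diagnosis` and the single-prefix code list (ICD-10 I48, atrial fibrillation and flutter) are all hypothetical, and real phenotypes typically combine several data sources and validation steps.

```python
from datetime import date

# Hypothetical code list: ICD-10 I48 covers atrial fibrillation and flutter.
AF_CODE_PREFIXES = ("I48",)


def first_af_diagnosis(records):
    """Return {patient_id: earliest date of a diagnosis code in the code list}.

    `records` is an iterable of (patient_id, diagnosis_code, event_date) tuples,
    an assumed layout chosen only for this illustration.
    """
    first_seen = {}
    for patient_id, code, event_date in records:
        if code.upper().startswith(AF_CODE_PREFIXES):
            if patient_id not in first_seen or event_date < first_seen[patient_id]:
                first_seen[patient_id] = event_date
    return first_seen


# Made-up example records, not real patient data.
records = [
    ("p1", "I48.0", date(2015, 3, 1)),
    ("p1", "I48.9", date(2012, 6, 9)),
    ("p2", "E11.9", date(2016, 1, 20)),
    ("p3", "i48.1", date(2018, 11, 5)),
]
cases = first_af_diagnosis(records)
```

Here `cases` contains only the patients with a matching code, each mapped to the date of their first recorded diagnosis.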
Unsupervised machine learning for sub-phenotype discovery
A growing body of evidence from observational and interventional research suggests that complex diseases, such as type 2 diabetes, asthma and chronic obstructive pulmonary disease (COPD), are composed of distinct sub-phenotypes with different risk factor and prognostic profiles. Our lab develops and evaluates data clustering algorithms to identify, describe and evaluate sub-phenotypes of COPD and heart failure. Identifying these disease subtypes can lead to the development of personalized treatments.
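To make the idea of data clustering concrete, here is a minimal k-means sketch in plain Python. k-means is used only as a familiar example of a clustering algorithm; the text does not say which algorithms the lab uses. The two features per patient, the starting centroids and the function name `kmeans` are assumptions made for illustration, and the values are invented.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster in place
        centroids = new_centroids
    return labels, centroids


# Invented, standardized features for six hypothetical patients.
points = [(-1.0, -1.2), (-0.8, -0.9), (-1.1, -1.0),
          (1.0, 0.9), (1.2, 1.1), (0.9, 1.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
```

With these inputs the first three patients end up in one cluster and the last three in the other. In practice the number of clusters, the feature set and the stability of the solution all need careful evaluation before any cluster is interpreted as a disease subtype.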
Supervised machine learning for risk stratification
Most traditional risk prediction approaches rely on regression-based statistical methods and may fail to exploit the richness of electronic health record data. Our lab evaluates supervised machine learning approaches for creating accurate and interpretable risk prediction tools. Our recent research has focused on predicting death in patients with coronary artery disease.
Data Linkage
Data linkage is the process of identifying and linking records that belong to the same individual across heterogeneous data sources. Working with the Federal University of Bahia, our lab is contributing to the development of scalable probabilistic data linkage methods for linking administrative data on over 140 million participants in Brazil, and to evaluating the quality of the linkage using supervised machine learning.
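One classical way to score a candidate record pair in probabilistic linkage is the Fellegi-Sunter match weight, sketched below. It is shown as a general illustration and is not necessarily the method developed in this collaboration. The field names and the m/u probabilities (the chance that a field agrees for true matches and for non-matches, respectively) are hypothetical, and missing values are simply treated as disagreements to keep the example short.

```python
import math

# Hypothetical (m, u) probabilities per field:
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELDS = {
    "name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "municipality": (0.90, 0.05),
}


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum of per-field log2 likelihood ratios for a pair of records."""
    total = 0.0
    for field, (m, u) in fields.items():
        value_a, value_b = rec_a.get(field), rec_b.get(field)
        if value_a is not None and value_a == value_b:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


# Invented records for illustration only.
a = {"name": "maria silva", "birth_date": "1980-02-14", "municipality": "Salvador"}
b = {"name": "maria silva", "birth_date": "1980-02-14", "municipality": "Salvador"}
c = {"name": "joao souza", "birth_date": "1975-09-30", "municipality": "Recife"}

same_pair = match_weight(a, b)
different_pair = match_weight(a, c)
```

Pairs scoring above an upper threshold would be accepted as links, pairs below a lower threshold rejected, and those in between reviewed; choosing those thresholds is one place where a supervised model trained on labeled pairs could help assess linkage quality.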