Our technology

Babylon’s Peer-Reviewed Research

Neural Temporal Point Processes for Modelling Electronic Health Records

Joseph Enguehard, Dan Busbridge, Adam Bozson, Claire Woodcock, Nils Hammerla

To address the temporal nature of electronic health records (EHRs), we treat them as samples generated by a Temporal Point Process (TPP), enabling us to model what happened in an event together with when it happened in a principled way. Our proposed attention-based Neural TPP performs favourably compared to existing models, and provides insight into how it models the EHR, an important step towards a component of clinical decision support systems.

Published in 2020 | ML for Health Workshop, NeurIPS 2020

A comparison of artificial intelligence and human doctors for the purpose of triage and diagnosis

Adam Baker, Yura Perov, Katherine Middleton, Janie Baxter, Daniel Mullarkey, Davinder Sangar, Mobasher Butt, Arnold DoRosario

We performed a validation study of the accuracy and safety of the Babylon AI triage system and human doctors using a set of identical clinical cases. Overall, we found that the AI system is able to provide patients with triage and diagnostic information with a level of clinical accuracy and safety comparable to that of human doctors.

Published in 2020 | Frontiers in Artificial Intelligence (Medicine and Public Health)

Applying Artificial Intelligence Methods For The Estimation Of Disease Incidence: The Utility Of Language Models

Yuanzhao Zhang, Robert Walecki, Joanne Winter, Felix Bragman, Sara Lourenco, Chris Hart, Adam Baker, Yura Perov and Saurabh Johri

AI-driven digital health tools often rely on estimates of disease incidence or prevalence, but obtaining these estimates is costly and time-consuming. We demonstrate that context-aware machine learning models can be used for estimating disease incidence. These methods are quicker to implement than traditional epidemiological approaches. We therefore suggest that they complement existing modelling efforts where data is required more rapidly or at larger scale. This may particularly benefit AI-driven digital health products where the data will undergo further processing and a validated approximation of the disease incidence is adequate.

Published in 2020 | Frontiers in Artificial Intelligence (Medicine and Public Health)

Biomedical Concept Relatedness – A large EHR-based benchmark

Claudia Schulz, Josh Levy-Kramer, Camille Van Assel, Miklos Kepes and Nils Hammerla

A promising application of AI to healthcare is the retrieval of information from electronic health records (EHRs), e.g. to aid clinicians in finding relevant information for a consultation or to recruit suitable patients for a study. This requires search capabilities beyond simple string matching, including the retrieval of medical concepts (diagnoses, symptoms, medications, etc.) related to the one in question. We open-source a novel medical concept relatedness benchmark, which is six times larger than existing datasets and consists of concept pairs that co-occur in EHRs, ensuring their relevance for medical information retrieval from EHRs.

Published in 2020 | COLING 2020

Improving the accuracy of medical diagnosis with causal machine learning

Jonathan G. Richens, Ciarán M. Lee & Saurabh Johri

Machine learning promises to revolutionize clinical decision making and diagnosis. In medical diagnosis a doctor aims to explain a patient’s symptoms by determining the diseases causing them. However, existing machine learning approaches to diagnosis are purely associative, identifying diseases that are strongly correlated with a patient’s symptoms. In this paper we show that this inability to disentangle correlation from causation can result in sub-optimal or dangerous diagnoses. To overcome this, we reformulate diagnosis as a counterfactual inference task and derive counterfactual diagnostic algorithms. We compare our counterfactual algorithms to the standard associative algorithm and 44 doctors using a test set of clinical vignettes. While the associative algorithm achieves an accuracy placing in the top 48% of doctors in our cohort, our counterfactual algorithm places in the top 25% of doctors, achieving expert clinical accuracy. Our results show that causal reasoning is a vital missing ingredient for applying machine learning to medical diagnosis.

Published in 2020 | Nature Communications

Estimating Mutual Information Between Dense Word Embeddings

Vitalii Zhelezniak, Aleksandar Savkov, April Shen, Nils Hammerla

Some of the top approaches to semantic textual similarity rely on various correlations between word embeddings, including the famous cosine similarity. We show that mutual information between dense word embeddings, despite being difficult to estimate, is another excellent candidate for semantic similarity and rivals existing state-of-the-art unsupervised methods.

Published in 2020 | ACL Journal

Hybrid Reasoning Over Large Knowledge Bases Using On-The-Fly Knowledge Extraction

Giorgos Stoilos, Damir Juric, Szymon Wartak, Claudia Schulz, Mohammad Khodadadi

The success of logic-based methods for comparing entities depends heavily on the axioms that have been described for them in the Knowledge Base (KB). Due to the incompleteness of even large and well-engineered KBs, such methods suffer from low recall when applied in real-world use cases. To address this, we designed a reasoning framework that combines logic-based subsumption with statistical methods for on-the-fly knowledge extraction.

Published in 2020 | European Semantic Web Conference

Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets

Claudia Schulz, Damir Juric

We create various large-scale datasets for testing whether embeddings correctly encode the similarity between medical terms and test existing state-of-the-art embeddings on these datasets. Our results reveal that existing embeddings cannot adequately represent medical terminology. Our new datasets are thus challenging new benchmarks for testing the adequacy of new medical embeddings in the future.

Published in 2020 | AAAI 2020

Multiverse: Causal Reasoning using Importance Sampling in Probabilistic Programming

Yura Perov, Logan Graham, Kostis Gourgoulias, Jonathan G. Richens, Ciarán M. Lee, Adam Baker, Saurabh Johri

The paper describes the design and analysis of a probabilistic programming engine for counterfactual probabilistic programming, both in general and, in particular, using importance sampling.

Published in 2019 | AABI 2019

Integrating overlapping datasets using bivariate causal discovery

Anish Dhir and Ciarán M. Lee

Knowing that a disease is highly correlated with symptoms, or a drug highly correlated with recovery, is not enough, and basing medical decisions on such information can be dangerous. To truly begin to revolutionise healthcare, AI must learn to distinguish cause and effect. Our work solves this by utilising new physics-inspired ideas about what it means for one variable to cause another, and showing how causal relationships in one dataset limit the possibilities in other overlapping datasets. To illustrate our algorithm, we apply it to breast cancer data, showing how to extract causal relations between two important features despite the fact that they were never measured in the same dataset.

Published in 2019 | AAAI 2020

Copy, paste, infer: a robust analysis of twin network counterfactual inference

Logan Graham, Ciarán M. Lee, Yura Perov

Provides an efficient way to conduct counterfactual simulation, benchmarked against the state of the art.

Published in 2019 | NeurIPS Causal Machine Learning workshop

Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors

Vitalii Zhelezniak, Aleksandar Savkov, April Shen, Francesco Moramarco, Jack Flann, Nils Y. Hammerla

We push the limits of word embeddings on semantic textual similarity tasks by introducing DynaMax, a novel unsupervised non-parametric similarity measure based on word vectors and fuzzy bag-of-words. This method is efficient and easy to implement, yet outperforms current baselines on STS tasks by a large margin.

Published in 2019 | ICLR

Supporting Digital Healthcare Services Using Semantic Web Technologies

Gintaras Barisevičius, Martin Coste, David Geleta, Damir Juric, Mohammad Khodadadi, Giorgos Stoilos, Ilya Zaihrayeu

In this paper we report on our efforts, and the challenges we faced, in using Semantic Web technologies to support the healthcare services provided by Babylon Health.

Published in 2018 | ISWC

A Universal Marginalizer for Amortized Inference in Generative Models

Douglas et al.

Published in 2017 | NIPS Workshop, NIPS 2017