Our technology

Babylon’s Peer-Reviewed Research

Neural Temporal Point Processes for Modelling Electronic Health Records

Joseph Enguehard, Dan Busbridge, Adam Bozson, Claire Woodcock, Nils Hammerla

To address their temporal nature, we treat EHRs as samples generated by a Temporal Point Process (TPP), enabling us to model what happened in an event with when it happened in a principled way. Our proposed attention-based Neural TPP performs favourably compared to existing models, and provides insight into how it models the EHR, an important step towards a component of clinical decision support systems.
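
As a rough illustration of the point-process view (not the attention-based model proposed in the paper), the sketch below scores a sequence of timestamped, labelled events under a simple self-exciting (Hawkes-style) process. The function name, the exponential kernel, and the parameters `mu`, `alpha`, and `beta` are assumptions chosen for illustration only.

```python
import math


def hawkes_log_likelihood(events, horizon, mu, alpha, beta):
    """Log-likelihood of a marked event sequence under a toy Hawkes process.

    events  : list of (time, mark) tuples sorted by time, all within [0, horizon]
    horizon : end of the observation window
    mu      : dict mapping each mark to its base rate (assumed parameterisation)
    alpha   : excitation added by every past event (shared across marks)
    beta    : exponential decay rate of that excitation (must be > 0)
    """
    log_lik = 0.0
    for i, (t_i, m_i) in enumerate(events):
        # Intensity of mark m_i at time t_i: base rate plus decayed
        # contributions from all earlier events ("when" informs "what").
        excitation = sum(
            alpha * math.exp(-beta * (t_i - t_j)) for t_j, _ in events[:i]
        )
        log_lik += math.log(mu[m_i] + excitation)

    # Compensator: integral of the total intensity over [0, horizon].
    n_marks = len(mu)
    compensator = sum(mu.values()) * horizon
    compensator += n_marks * (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - t_j)) for t_j, _ in events
    )
    return log_lik - compensator


# Hypothetical toy record: two events with different marks.
record = [(1.0, "diagnosis"), (2.0, "prescription")]
rates = {"diagnosis": 0.5, "prescription": 0.25}
score = hawkes_log_likelihood(record, horizon=3.0, mu=rates, alpha=0.3, beta=1.0)
```

A neural TPP would replace the fixed intensity above with a learned function of the event history, but the likelihood keeps the same two parts: the log-intensity at each observed event minus the integrated intensity over the window.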

Published in 2020 | ML for Health Workshop, NeurIPS 2020

A comparison of artificial intelligence and human doctors for the purpose of triage and diagnosis

Adam Baker, Yura Perov, Katherine Middleton, Janie Baxter, Daniel Mullarkey, Davinder Sangar, Mobasher Butt, Arnold DoRosario

We performed a validation study of the accuracy and safety of the Babylon AI triage system and human doctors using a set of identical clinical cases. Overall, we found that the AI system is able to provide patients with triage and diagnostic information with a level of clinical accuracy and safety comparable to that of human doctors.

Published in 2020 | Frontiers in Artificial Intelligence Medicine and Public Health

Applying Artificial Intelligence Methods For The Estimation Of Disease Incidence: The Utility Of Language Models

Yuanzhao Zhang, Robert Walecki, Joanne Winter, Felix Bragman, Sara Lourenco, Chris Hart, Adam Baker, Yura Perov and Saurabh Johri

AI-driven digital health tools often rely on estimates of disease incidence or prevalence, but obtaining these estimates is costly and time-consuming. We demonstrate that context-aware machine learning models can be used for estimating disease incidence. These methods are quicker to implement than traditional epidemiological approaches. We therefore suggest that they complement existing modelling efforts where data is required more rapidly or at larger scale. This may particularly benefit AI-driven digital health products where the data will undergo further processing and a validated approximation of the disease incidence is adequate.

Published in 2020 | Frontiers in Artificial Intelligence Medicine and Public Health

Biomedical Concept Relatedness – A large EHR-based benchmark

Claudia Schulz, Josh Levy-Kramer, Camille Van Assel, Miklos Kepes and Nils Hammerla

A promising application of AI to healthcare is the retrieval of information from electronic health records (EHRs), e.g. to aid clinicians in finding relevant information for a consultation or to recruit suitable patients for a study. This requires search capabilities beyond simple string matching, including the retrieval of medical concepts (diagnoses, symptoms, medications, etc.) related to the one in question. We open-source a novel medical concept relatedness benchmark, which is six times larger than existing datasets and consists of concept pairs that co-occur in EHRs, ensuring their relevance for medical information retrieval from EHRs.

Published in 2020 | Coling 2020

Improving the accuracy of medical diagnosis with causal machine learning

Jonathan G. Richens, Ciarán M. Lee & Saurabh Johri

Machine learning promises to revolutionize clinical decision making and diagnosis. In medical diagnosis a doctor aims to explain a patient’s symptoms by determining the diseases causing them. However, existing machine learning approaches to diagnosis are purely associative, identifying diseases that are strongly correlated with a patient’s symptoms. In this paper we show that this inability to disentangle correlation from causation can result in sub-optimal or dangerous diagnoses. To overcome this, we reformulate diagnosis as a counterfactual inference task and derive counterfactual diagnostic algorithms. We compare our counterfactual algorithms to the standard associative algorithm and 44 doctors using a test set of clinical vignettes. While the associative algorithm achieves an accuracy placing in the top 48% of doctors in our cohort, our counterfactual algorithm places in the top 25% of doctors, achieving expert clinical accuracy. Our results show that causal reasoning is a vital missing ingredient for applying machine learning to medical diagnosis.

Published in 2020 | Nature Communications

Estimating Mutual Information Between Dense Word Embeddings

Vitalii Zhelezniak, Aleksandar Savkov, April Shen, Nils Hammerla

Some of the top approaches to semantic textual similarity rely on various correlations between word embeddings, including the famous cosine similarity. We show that mutual information between dense word embeddings, despite being difficult to estimate, is another excellent candidate for semantic similarity and rivals existing state-of-the-art unsupervised methods.

Published in 2020 | ACL Journal

Hybrid Reasoning Over Large Knowledge Bases Using On-The-Fly Knowledge Extraction

Giorgos Stoilos, Damir Juric, Szymon Wartak, Claudia Schulz, Mohammad Khodadadi

The success of logic-based methods for comparing entities heavily depends on the axioms that have been described for them in the Knowledge Base (KB). Due to the incompleteness of even large and well engineered KBs, such methods suffer from low recall when applied in real-world use cases. To address this, we designed a reasoning framework that combines logic-based subsumption with statistical methods for on-the-fly knowledge extraction.

Published in 2020 | European Semantic Web Conference

Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets

Claudia Schulz, Damir Juric

We create various large-scale datasets for testing whether embeddings correctly encode the similarity between medical terms and test existing state-of-the-art embeddings on these datasets. Our results reveal that existing embeddings cannot adequately represent medical terminology. Our new datasets are thus challenging new benchmarks for testing the adequacy of new medical embeddings in the future.

Published in 2020 | AAAI 2020

An Ontology-Based Interactive System for Understanding User Queries

Giorgos Stoilos, Szymon Wartak, Damir Juric, Jonathan Moore, Mohammad Khodadadi

In the current paper we present a framework for automatically building a small dialogue for the purposes of bridging the gap between user queries and a set of pre-defined (target) ontology concepts. We show how we can use the ontology and statistical techniques to select an initial small set of candidate concepts from the target ones and how these can then be grouped into categories using their properties in the ontology.

Published in 2019 | European Semantic Web Conference

Multiverse: Causal Reasoning using Importance Sampling in Probabilistic Programming

Yura Perov, Logan Graham, Kostis Gourgoulias, Jonathan G. Richens, Ciarán M. Lee, Adam Baker, Saurabh Johri

The paper describes a probabilistic programming engine design and its analysis for counterfactual probabilistic programming, in general and in particular using importance sampling.

Published in 2019 | AABI 2019

Masking schemes for universal marginalisers

Kostis Gourgoulias, Maria Lomeli, Daniel Thompson, Divya Gautam

In this paper we study generative models which mimic reasoning under partially observed evidence scenarios and take decisions about which disease is more likely for the patient. Specifically, we explore different generative models in terms of learning efficiency.

Published in 2019 | AABI 2019

Integrating overlapping datasets using bivariate causal discovery

Anish Dhir and Ciarán M. Lee

Knowing that a disease is highly correlated with symptoms, or a drug highly correlated with recovery, is not enough, and basing medical decisions on such information can be dangerous. To truly begin to revolutionise healthcare, AI must learn to distinguish cause and effect. Our work solves this by utilising new physics-inspired ideas about what it means for one variable to cause another, and showing how causal relationships in one dataset limit the possibilities in other overlapping datasets. To illustrate our algorithm, we apply it to breast cancer data, showing how to extract causal relations between two important features despite the fact that they were never measured in the same dataset.

Published in 2019 | AAAI 2020

A System for Medical Information Extraction and Verification From Unstructured Text

Damir Juric, Giorgos Stoilos, Andre Melo, Jonathan Moore and Mohammad Khodadadi

A wealth of medical knowledge has been encoded in terminologies like SNOMED CT, NCI, FMA, and more. However, these resources usually lack information such as relations between diseases, symptoms, and risk factors, which prevents their use in diagnostic or other decision-making applications. In this paper we presented a pipeline for extracting such information from unstructured text and enriching medical knowledge bases.

Published in 2019 | IAAI 2020 at AAAI 2020

Correlations between Word Vector Sets

Vitalii Zhelezniak, April Shen, Daniel Busbridge, Aleksandar Savkov, Nils Hammerla

We interpret word similarity as correlations between word embeddings and generalise this view to the sentence-level similarity by considering either vector pooling or multivariate correlation coefficients. Both approaches rival state-of-the-art methods on standard semantic textual similarity benchmarks.

Published in 2019 | EMNLP-IJCNLP

Copy, paste, infer: a robust analysis of twin network counterfactual inference

Logan Graham, Ciarán M. Lee, Yura Perov

Provides an efficient way to conduct counterfactual simulation, benchmarked against the state of the art.

Published in 2019 | NeurIPS Causal Machine Learning workshop

Multilingual Factor Analysis

Vargas et al.

In this work we approach the task of learning multilingual word representations in an offline manner by fitting a generative latent variable model to a multilingual dictionary. We model equivalent words in different languages as different views of the same word generated by a common latent variable representing their latent lexical meaning.

Published in 2019 | ACL

Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors

Vitalii Zhelezniak, Aleksandar Savkov, April Shen, Francesco Moramarco, Jack Flann, Nils Y. Hammerla

We push the limits of word embeddings on semantic textual similarity tasks by introducing DynaMax, a novel unsupervised non-parametric similarity measure based on word vectors and fuzzy bag-of-words. This method is efficient and easy to implement, yet outperforms current baselines on STS tasks by a large margin.
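
The idea of a max-pooled, fuzzy bag-of-words similarity can be sketched roughly as follows. This is a minimal illustration in the spirit of DynaMax, assuming each sentence is given as a matrix of word vectors (one row per word); the function names and the NumPy-based layout are assumptions, and details may differ from the paper.

```python
import numpy as np


def fuzzify(sentence, universe):
    """Max-pooled membership of a sentence in a fuzzy set over `universe`.

    sentence : (n_words, dim) array of word vectors
    universe : (n_universe, dim) array whose rows act as feature detectors
    """
    scores = sentence @ universe.T      # similarity of every word to every row
    pooled = scores.max(axis=0)         # max-pool over the words of the sentence
    return np.maximum(pooled, 0.0)      # clip so memberships are non-negative


def dynamax_jaccard(x, y):
    """Fuzzy Jaccard similarity between two sentences of word vectors."""
    universe = np.vstack((x, y))        # universe built from both sentences
    m_x = fuzzify(x, universe)
    m_y = fuzzify(y, universe)
    intersection = np.minimum(m_x, m_y).sum()
    union = np.maximum(m_x, m_y).sum()
    return float(intersection / union)


# Hypothetical 2-dimensional "embeddings" for two tiny sentences.
sent_a = np.array([[1.0, 0.0], [0.0, 1.0]])
sent_b = np.array([[1.0, 0.0]])
similarity = dynamax_jaccard(sent_a, sent_b)
```

The sketch assumes at least one positive membership value so that the union is non-zero; real word vectors would come from a pre-trained embedding model.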

Published in 2019 | ICLR

Model Comparison for Semantic Grouping

Vargas et al.

We introduce a probabilistic framework for quantifying the semantic similarity between two groups of embeddings. We formulate the task of semantic similarity as a model comparison task in which we contrast a generative model which jointly models two sentences versus one that does not. We illustrate how this framework can be used for the Semantic Textual Similarity tasks using clear assumptions about how the embeddings of words are generated.

Published in 2019 | ICML

Correlation Coefficients and Semantic Textual Similarity

Vitalii Zhelezniak, Aleksandar Savkov, April Shen, Nils Hammerla

We introduce a novel statistical view on semantic textual similarity and cast it as correlations between word embeddings. We study the statistics of popular embedding models and show that simple word embeddings together with rank correlations can easily rival the strongest deep representations on semantic textual similarity tasks.
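
To make the rank-correlation view concrete, here is a minimal sketch that mean-pools the word vectors of each sentence and compares the pooled vectors with Spearman's correlation. The helper names and the choice of mean pooling are assumptions for illustration, not necessarily the exact setup evaluated in the paper.

```python
import math


def ranks(values):
    """Ranks from 1 to n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pool(word_vectors):
    """Average a list of word vectors dimension by dimension."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def sentence_similarity(sentence_a, sentence_b):
    """Similarity of two sentences given as lists of word vectors."""
    return spearman(mean_pool(sentence_a), mean_pool(sentence_b))
```

Because only the ordering of the embedding dimensions matters, this measure is insensitive to monotone rescaling of individual vectors, which is one intuition for why rank correlations can be more robust than cosine similarity.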

Published in 2019 | NAACL-HLT

Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks

Vitalii Zhelezniak, Dan Busbridge, April Shen, Samuel L. Smith, Nils Y. Hammerla

Intriguingly, simple models outperform complex deep networks on many unsupervised text similarity tasks. We provide an intuitive yet rigorous explanation for this behaviour by introducing the concept of an optimal representation space, in which similarity is induced by the model's objective function.

Published in 2018 | ICLR Workshop

A Novel Approach and Practical Algorithms for Ontology Integration

Giorgos Stoilos, David Geleta, Jetendr Shamdasani and Mohammad Khodadadi

In this paper we present a framework and novel approach for integrating independently developed ontologies. Starting from an initial seed ontology which may already be in use by an application, new sources are used to iteratively enrich and extend the seed one. To deal with structural incompatibilities we present a novel fine-grained approach which is based on mapping repair and alignment conservativity, formalise it, and provide exact as well as approximate but practical algorithms.

Published in 2018 | ISWC

Supporting Digital Healthcare Services Using Semantic Web Technologies

Gintaras Barisevičius, Martin Coste, David Geleta, Damir Juric, Mohammad Khodadadi, Giorgos Stoilos, Ilya Zaihrayeu

In this paper we report on our efforts, and the challenges we faced, in using Semantic Web technologies to support healthcare services provided by Babylon Health.

Published in 2018 | ISWC

Reasoning with Textual Queries: A Case of Medical Text

Damir Juric, Giorgos Stoilos, Szymon Wartak, Mohammad Khodadadi

Published in 2018 | ISWC

Methods and Metrics for Knowledge Base Engineering and Integration

Stoilos et al.

In this paper we investigated the possibility of integrating different and largely heterogeneous biomedical ontologies. We reported on our Knowledge Base construction pipeline which is based on ontology integration and focused on the various metrics, techniques, and tools we have developed in order to assist in achieving this large-scale integration task.

Published in 2018 | WOP

Medical Knowledge Graph Construction by Aligning Large Biomedical Datasets

Stoilos et al.

Building large Knowledge Bases can be realised by aligning and integrating existing data sources. To support AI-based digital healthcare services within Babylon Health, significant effort to build a large medical KB was recently undertaken. To realise this goal, a highly configurable and modular ontology integration pipeline has been created that contains three phases: a Matching phase, an Aggregation phase, and a final PostProcessing phase.

Published in 2018 | OM

Offline bilingual word vectors, orthogonal transformations and the inverted soft-max

Smith et al.

Pre-trained word embeddings can be aligned with a linear transformation, using dictionaries compiled from expert knowledge. In this work, we prove that the linear transformation between two embedding spaces should be orthogonal, and it can be obtained using the singular value decomposition. We also introduce a novel “inverted softmax” for identifying translation pairs to improve on precision @1 mapping from previous work.
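
The SVD-based alignment can be sketched as the classic orthogonal Procrustes solution. The snippet below is a minimal illustration under assumed conventions (row-vector embeddings, a hypothetical function name, synthetic data); it omits any normalisation steps and the inverted softmax described in the paper.

```python
import numpy as np


def orthogonal_alignment(source, target):
    """Orthogonal matrix W minimising ||source @ W - target||_F.

    source, target : (n_pairs, dim) arrays; row i of each holds the
    embeddings of the i-th dictionary pair in the two languages.
    """
    # SVD of the cross-covariance between the two embedding spaces.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


# Synthetic check: build a target space that is an exact rotation or
# reflection of the source, then recover the transformation.
rng = np.random.default_rng(0)
source_vectors = rng.normal(size=(20, 3))
true_map, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
target_vectors = source_vectors @ true_map

learned_map = orthogonal_alignment(source_vectors, target_vectors)
```

Constraining the map to be orthogonal preserves dot products and vector norms, so similarities within the source language are unchanged after alignment.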

Published in 2017 | ICLR