Data Challenges of building an AI doctor (3/5): Building the Full Picture
Written by Christina Hu, 4 min read

This blog is part 3 of a 5 part series. Explore the full series: Part 1 Part 2 Part 3 Part 4 Part 5
Challenge #3: How do we get a holistic view of our members’ health when not all their healthcare happens through Babylon?
This is part three of a five part blog series exploring some of the challenges we’ve been facing while building a personal AI doctor.
Up until now, we’ve focused on data that comes to us from direct interactions with the Babylon platform.
That largely includes: 1) Data provided by members about themselves, and 2) Data provided by Babylon clinicians about members.
There’s a lot of insight in there, but it doesn’t give us the full picture of our members’ health.
For example, many people use Babylon as a private service alongside their regular GP. Our doctors have told us that without seeing these patients’ NHS health records, they’re uncomfortable making certain clinical decisions, like issuing repeat prescriptions.
Without a complete view of their health, the care we can give our members is limited.
Problem: How do we get a holistic view of our members’ health when not all their healthcare happens through Babylon?
Short answer: we bring in data from other sources.
The slightly longer answer involves understanding what data to bring in and how to make sense of data from different systems - all whilst making sure our members stay in control of their data.
We need data that complements what we already receive through the Babylon platform and is relevant to our mission.
Let’s take an example from each side of the Circle of Care:
- Sickcare: Babylon clinicians need to understand a private patient’s medical history thoroughly to prescribe repeat medications. Some of the patient’s medical history is found in their NHS health record. So we must provide our clinicians with a view of these patients’ NHS health records.
- Healthcare: To help people stick to their health plans, we need to track their progress on a regular, continuous basis. Many people use apps and wearables to monitor their sleep, nutrition, activity and other lifestyle measures. So we need to access the data collected by these apps and wearables.

Diagram 6: Bringing in different types of data from relevant external sources is key to us building up a complete view of our members’ health
These examples also illustrate another important point - that there are two types of data we value:
- Data that directly offers medical information about the member (e.g. NHS health records)
- Data that doesn’t inherently tell us anything about the member’s health, but can be processed and interpreted to give a wealth of medical insight (e.g. step-count recorded by wearables)
LESSON 7: Seek out both data that explicitly encodes medical insight and data that can be processed to generate insight.
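To make the second type of data concrete, here is a minimal sketch of how raw step counts from a wearable might be turned into a simple activity signal. The `DailySteps` type, the `weekly_activity_summary` function and the `target_steps` threshold are all hypothetical names and values chosen for illustration. They are not Babylon's actual pipeline, and the threshold is not a clinical recommendation.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class DailySteps:
    """One day's step count as reported by a wearable (hypothetical shape)."""
    day: date
    steps: int


def weekly_activity_summary(readings, target_steps=8000):
    """Summarise raw daily step counts into a simple activity signal.

    target_steps is an assumed threshold for illustration only.
    Returns None when there are no readings to summarise.
    """
    if not readings:
        return None
    counts = [r.steps for r in readings]
    average = mean(counts)
    return {
        "average_steps": round(average),
        "days_on_target": sum(1 for c in counts if c >= target_steps),
        "days_recorded": len(counts),
        "on_track": average >= target_steps,
    }


if __name__ == "__main__":
    week = [
        DailySteps(date(2020, 1, 6), 4000),
        DailySteps(date(2020, 1, 7), 6000),
        DailySteps(date(2020, 1, 8), 8000),
        DailySteps(date(2020, 1, 9), 10000),
    ]
    print(weekly_activity_summary(week))
```

The step count on its own says little about health. The derived fields (average, days on target, on-track flag) are the kind of processed insight a health plan could act on.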
So we’ve worked out what data we need and where to get it from. Next, we just connect their systems to ours and the data comes pouring in.
If only it were that easy.
In an ideal world, data would always come to us in standard structures and formats.
Reality reveals quite the opposite.
Despite decades of collective global effort [2], medical information is described using many different coding systems - ICD-10, Read2, SNOMED CT - to name just a few!
Standards such as FHIR exist, but in practice they’re frequently not followed. So more often than not we need to do the translating between coding systems ourselves.
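As a rough sketch of what "doing the translating ourselves" can look like, the snippet below normalises codes from different source systems onto one target vocabulary and sets aside anything it cannot map. SNOMED CT is used as the target purely for illustration. The `TO_SNOMED` table, `to_snomed` and `normalise` are hypothetical names, and the handful of entries are examples only: a real service would load an officially published map and verify every entry rather than hard-code them.

```python
# Illustrative lookup: (source system, source code) -> SNOMED CT concept ID.
# Entries are examples only and should be verified against an official map.
TO_SNOMED = {
    ("ICD-10", "E11"): "44054006",    # Type 2 diabetes mellitus
    ("READ2", "C10F."): "44054006",   # Type 2 diabetes mellitus
    ("ICD-10", "I10"): "59621000",    # Essential hypertension
    ("READ2", "G20.."): "59621000",   # Essential hypertension
}


def to_snomed(system, code):
    """Return the mapped concept ID, or None if no mapping is known."""
    key = (system.strip().upper(), code.strip())
    return TO_SNOMED.get(key)


def normalise(records):
    """Split incoming records into mapped and unmapped lists.

    Each record is assumed to be a dict with "system" and "code" keys.
    Unmapped records are kept for manual review, not silently dropped.
    """
    mapped, unmapped = [], []
    for rec in records:
        concept = to_snomed(rec["system"], rec["code"])
        if concept is None:
            unmapped.append(rec)
        else:
            mapped.append({**rec, "snomed": concept})
    return mapped, unmapped


if __name__ == "__main__":
    incoming = [
        {"system": "Read2", "code": "C10F."},
        {"system": "ICD-10", "code": "I10"},
        {"system": "ICD-10", "code": "Z99"},  # not in the illustrative table
    ]
    ok, review = normalise(incoming)
    print(ok)
    print(review)
```

Keeping the unmapped records visible matters: a code that fails to translate is still part of the member's history, so it needs review and should not disappear.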
SNOMED can describe almost 350,000 unique medical concepts [3], so you may think that it is pretty comprehensive. But it’s not nearly enough to describe all the medical conditions we see on the Babylon platform.
That’s why we’ve made our own lingua franca which is more expressive than any medical coding system that exists currently - more on that in the section on Single Medical Language...
LESSON 8: Factor in plenty of time and resources to overcome the challenge of interoperability.
All of that presents a lot to think about already, so it can be easy to overlook what’s most important: our members.
Remember that the whole point of going through the pains of bringing in all this data is to give our members the safest, most effective and most convenient care we can - all within an environment of implicit trust. Trust that we always keep our members’ interests front of mind. Trust that members have full transparency and control over their data every step of the way [4]. And that includes Babylon “forgetting” their data if that’s what they want.
The capabilities that support this need to be built in before we even think about tackling data integration challenges.
LESSON 9: Before bringing in new user data, create a relationship of trust with users that is built on transparency and control.
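One way to picture "control built in first" is a store that refuses to ingest external data without consent and can forget a member entirely. The sketch below is an in-memory toy under stated assumptions: `MemberDataStore` and its methods are hypothetical names, and the rule that revoking consent also deletes data from that source is a design assumption, not a description of how Babylon works.

```python
from dataclasses import dataclass, field


@dataclass
class MemberDataStore:
    """In-memory stand-in for a store of externally sourced member data."""
    consents: dict = field(default_factory=dict)  # member_id -> set of sources
    records: dict = field(default_factory=dict)   # member_id -> [(source, payload)]

    def grant(self, member_id, source):
        """Record that the member allows data from this source."""
        self.consents.setdefault(member_id, set()).add(source)

    def revoke(self, member_id, source):
        """Withdraw consent and (by assumption) drop data from that source."""
        self.consents.get(member_id, set()).discard(source)
        if member_id in self.records:
            self.records[member_id] = [
                (s, p) for s, p in self.records[member_id] if s != source
            ]

    def ingest(self, member_id, source, payload):
        """Store a payload only if the member has consented to the source."""
        if source not in self.consents.get(member_id, set()):
            raise PermissionError(f"no consent from {member_id} for {source}")
        self.records.setdefault(member_id, []).append((source, payload))

    def forget(self, member_id):
        """Remove everything held about the member, including consents."""
        self.consents.pop(member_id, None)
        self.records.pop(member_id, None)


if __name__ == "__main__":
    store = MemberDataStore()
    store.grant("member-1", "wearable")
    store.ingest("member-1", "wearable", {"steps": 7200})
    store.forget("member-1")
    print(store.records)
```

The consent check sits inside `ingest` rather than in the caller, so no integration can bypass it by accident.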
2. Mapping between the code systems of different countries: a case study https://www.hddaccess.com/tips/mapping-between-the-code-systems-of-different-countries-a-case-study-cci-to-icd-10-pcs.
3. https://bioportal.bioontology.org/ontologies/SNOMEDCT
4. Here’s how Babylon creates an environment of trust with our members: https://www.babylonhealth.com/...
The information provided is for educational purposes only and is not intended to be a substitute for professional medical advice, diagnosis, or treatment. Seek the advice of a doctor with any questions you may have regarding a medical condition. Never delay seeking or disregard professional medical advice because of something you have read here.