Babylon Health

Data Challenges of building an AI doctor (2/5): Accuracy

Written by Christina Hu, 4 min read

This blog is part 2 of a 5 part series. Explore the full series: Part 1  Part 2  Part 3  Part 4  Part 5

Challenge #2: What can we do about members purposefully or unknowingly giving us inaccurate information about themselves?

This is part two of a five-part blog series exploring some of the challenges we’ve faced while building a personal AI doctor.

So far, we’ve been treating all the data we gather as having equal potential value, with the ultimate value of each data-point depending simply on the use case.

But the truth is, not all data-points are created equal.

If we look only at data provided directly by members, the inherent value of each data-point depends on the self-awareness, health literacy and intent of the member.

Diagram 4: Some examples to show why we can’t always rely on the data-points our members provide

Inaccurate information compromises our ability to give people the care that they need. Given false symptoms, our symptom checker would identify the wrong possible underlying causes, and the training of our machine learning models would be misled by false patient data.

Problem: What can we do about members purposefully or unknowingly giving us inaccurate information about themselves?

First, we must distinguish between data-points with different levels of trustworthiness. There are many ways to do this, with varying degrees of precision. The more precisely we draw the lines, the more effort it takes to implement.

Follow the 80:20 rule and always start simple.

One simple approach is to assign a relative trustworthiness level to each source of data.

Diagram 5: A simple spectrum of the relative trustworthiness of different data sources in Babylon

LESSON 4: Start with a simple “hard-coded” approach to determine how much to trust each data-point.
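To make the “hard-coded” idea concrete, here is a minimal sketch of a lookup table that maps each data source to a relative trust level and then buckets data-points accordingly. The source names, the three levels and the rule that unknown sources default to the lowest level are all assumptions made for illustration; they are not taken from Diagram 5 or from Babylon’s actual implementation.

```python
from enum import IntEnum


class Trust(IntEnum):
    """Relative trust levels; a higher value means more trusted."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical source names and levels, for illustration only.
SOURCE_TRUST = {
    "clinician_entered": Trust.HIGH,
    "connected_device": Trust.MEDIUM,
    "member_self_reported": Trust.LOW,
}


def trust_for(source: str) -> Trust:
    """Look up the hard-coded trust level; unknown sources default to LOW."""
    return SOURCE_TRUST.get(source, Trust.LOW)


def bucket_by_trust(data_points):
    """Group data-points (dicts with a 'source' key) by trust level."""
    buckets = {level: [] for level in Trust}
    for point in data_points:
        buckets[trust_for(point["source"])].append(point)
    return buckets
```

A table like this is cheap to review and change, which suits the “start simple” advice: the lines can be drawn more precisely later without touching the code that consumes the buckets.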

After separating our data into different buckets based on trustworthiness, we can think about how to treat each bucket differently when we use that data.

Just like with relevance, it all comes down to the goal.

For example, the data we provide for our GP consultations needs to be as accurate as possible, because a wrong decision could ultimately be life-endangering.

However, inaccurate input to Healthcheck would at worst result in suggestions that aren’t best suited to the individual but still aim to improve their health.

Our tolerance for using potentially inaccurate information depends first and foremost on regulatory requirements surrounding the specific use case. Where regulations don’t specify, our tolerance threshold is determined by the potential impact of making a wrong decision based on inaccurate information. The greater the potential impact, the lower our tolerance.

LESSON 5: Data accuracy tolerance thresholds vary by use case.
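The rule described above (regulation first, then potential impact) can be sketched as a small function that returns the minimum trust level a use case will accept. This is only an illustration under stated assumptions: the 1 to 3 scales for impact and trust, the field names and the example use cases are hypothetical, and real thresholds would come from regulatory and clinical review rather than from a formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UseCase:
    name: str
    # Hypothetical score for the impact of a wrong decision: 1 (low) to 3 (high).
    impact: int
    # Minimum trust level (1 to 3) mandated by regulation, if any.
    regulatory_min_trust: Optional[int] = None


def min_trust_required(use_case: UseCase) -> int:
    """Regulation takes precedence; otherwise higher impact means lower tolerance."""
    if use_case.regulatory_min_trust is not None:
        return use_case.regulatory_min_trust
    # Assumption: impact and trust share the same 1-3 scale.
    return use_case.impact


def usable(points, use_case: UseCase):
    """Keep only data-points whose 'trust' meets the use case's threshold."""
    threshold = min_trust_required(use_case)
    return [p for p in points if p["trust"] >= threshold]
```

With these assumed scores, a high-impact use case such as a GP consultation would only see the most trusted data-points, while a low-impact one such as Healthcheck could use everything.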

Of course, it would be better if we didn’t have to worry about untrustworthy data at all.

Whilst it’s extremely hard to address the problem of members unknowingly providing inaccurate data (they don’t know what they don’t know), there are ways we can tackle the problem of members purposefully providing inaccurate data.

This is where it helps to understand the psychology behind our members’ behaviour.

With the Babylon symptom checker, we know there are two common reasons why members provide false data on purpose: 1) Experimentation and 2) Gaming.

Experimentation: People like to play around with the symptom checker to test out its capabilities or just for fun. To let members satisfy this desire without interfering with serious usage of Babylon services, we choose not to stop experimentation but instead find ways to separate experimental from genuine interactions. One way we do this for our NHS members is by simply asking them if they’d really like us to add what they just told us to their NHS health record.

Gaming: Some people want to force the symptom checker to give a certain output. They do this because they perceive the output to be rewarding in some way. So we reduce gaming by decoupling the “reward” from the pattern of actions that is (perceived to be) associated with the “reward”.

LESSON 6: Understand members’ motives to incentivise them to provide accurate data to the best of their abilities.
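The confirmation step described for experimentation could be modelled as a simple flag on each interaction. The sketch below assumes that an interaction the member declines to add to their record is treated as experimental; the class, field and function names are hypothetical and do not describe Babylon’s data model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Interaction:
    member_id: str
    symptoms: List[str]
    # True only if the member agreed to add this to their health record.
    confirmed_for_record: bool = False


def split_interactions(
    interactions: List[Interaction],
) -> Tuple[List[Interaction], List[Interaction]]:
    """Return (genuine, experimental) based on the member's confirmation."""
    genuine = [i for i in interactions if i.confirmed_for_record]
    experimental = [i for i in interactions if not i.confirmed_for_record]
    return genuine, experimental
```

Defaulting the flag to False keeps unconfirmed input out of the genuine set unless the member explicitly opts in.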
