We hear a lot about AI (Artificial Intelligence) growing ever more powerful and taking over the world. It’s true that AI is getting better, but it’s still very domain-specific and actually quite brittle.
It is also potentially rather error-prone. Computers are only ever as good as what we give them to work on: GIGO (Garbage In, Garbage Out), after all. Consider the sexist AIs out there: US researchers trained an algorithm on a set of photos of people in kitchens, in which more women happened to be visible. After it had looked at over 100,000 images from the Internet, its assumption that only women appear in domestic settings grew stronger, amplifying some very questionable assumptions. A similar problem cropped up for Amazon, which had to turn off a robot recruiter that had been trained on too many male CVs and so was biased against women applying for technical jobs at the firm.
Clearly, we need smarter AI to avoid such social issues, and the missed business opportunities they represent. Just throwing more hardware at the problem, however, can’t be the only way to optimise AI. Computers can’t understand context, for one, and you still need to fit all of your data into storage before any useful AI calculation can run.
The problem is this: how do you get the right data when it is missing one crucial axis, relationships? When you think about it, real business insight is based on how things are connected. The good news is that graph technology, whose sine qua non is working with relationships, can be a real boon here, as there are lots of problems where graphs can help feed the right data into your AI software.
As a result, there are lots of ways in which graph databases can speed AI work up, and by a lot: not by just throwing more compute power at the problem, but by throwing the right compute power at it and introducing a data structure optimised for relationships.
The power of connections
There is also something amiss with Machine Learning. Too often we’re training these models to predict things via a misleading, unconnected view of the world. We are missing nuance: how things are related. As James Fowler, co-author of Connected, notes, if I want to predict whether you are a smoker or not, I can either get all the facts about you (name, age, medical history, demographics, etc.) or I can find out whether the people in your social graph are already smokers. Of those two options, the social graph lets me predict with a much higher degree of confidence that you are (or will be) a smoker. Astonishingly, this holds not just for your direct friends or friends of friends, but for friends three hops out: friends of friends of friends.
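To make the three-hops idea concrete, here is a minimal Python sketch. It is not Fowler’s method or any real model: the friendship data, the names and the functions `people_within` and `smoker_share` are all invented for illustration. It simply walks a toy social graph outwards and measures what share of the people reached are smokers, the kind of connected signal a model could be fed alongside the usual facts.

```python
from collections import deque

# Hypothetical toy friendship graph (undirected), stored as adjacency sets.
FRIENDS = {
    "you": {"ann", "bob"},
    "ann": {"you", "cat"},
    "bob": {"you", "dan"},
    "cat": {"ann", "eve"},
    "dan": {"bob"},
    "eve": {"cat", "fay"},
    "fay": {"eve"},
}
# Hypothetical labels: who is already known to smoke.
SMOKERS = {"ann", "cat", "eve", "fay"}


def people_within(graph, start, max_hops):
    """Breadth-first search: map everyone within max_hops of start to their hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in graph.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # the starting person is not their own friend
    return distance


def smoker_share(graph, smokers, start, max_hops=3):
    """Fraction of the people within max_hops of start who are smokers."""
    reached = people_within(graph, start, max_hops)
    if not reached:
        return 0.0
    return sum(person in smokers for person in reached) / len(reached)


if __name__ == "__main__":
    print(people_within(FRIENDS, "you", 3))
    print(smoker_share(FRIENDS, SMOKERS, "you"))
```

Changing `max_hops` from 1 to 3 shows how the picture shifts as friends of friends, and then their friends, are brought into view.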
If a social graph can do that, imagine what it can do when working in lock-step with AI. There are two concrete examples I can give. The first is when you’ve got a very large dataset and you want to ask it a simple question: “Have I seen a customer like this before, and what did they do next?”
To establish that by brute-force computation would mean looking at every customer you’ve ever seen, and you might have tens or hundreds of millions of customers. But that’s not how your brain would make that decision. Your brain would leverage context and filter down to say: of the millions of customers you’ve seen before, here are the hundred that are the most similar.
Only then would you start reasoning to say, “she was looking for something similar and this is the decision she made, and these people tend to do the same.” That process of filtering down by context to reach the right subset to analyse is a difficult calculation to make in the AI space.
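The filter-then-reason process described above might look roughly like this in Python. This is a sketch under stated assumptions, not how any particular graph database works: the purchase history and the functions `build_index`, `similar_customers` and `next_actions` are hypothetical. The point is that only customers connected to the current basket through a shared product are ever touched, and the “what did they do next?” step runs on that small subset alone.

```python
from collections import Counter

# Hypothetical purchase history: customer -> products bought.
PURCHASES = {
    "alice": ["tent", "stove", "lantern"],
    "bea": ["tent", "stove", "sleeping bag"],
    "carl": ["tent", "boots"],
    "dina": ["laptop", "mouse"],
}


def build_index(purchases):
    """Invert the history so each product points to the customers who bought it."""
    buyers = {}
    for customer, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(customer)
    return buyers


def similar_customers(buyers, basket, top_n=100):
    """Rank past customers by how many products they share with the basket.

    Customers with nothing in common with the basket are never visited.
    """
    overlap = Counter()
    for item in set(basket):
        for customer in buyers.get(item, ()):
            overlap[customer] += 1
    return overlap.most_common(top_n)


def next_actions(purchases, similar, basket):
    """Count what the similar customers bought that is not already in the basket."""
    seen = set(basket)
    votes = Counter()
    for customer, _shared in similar:
        for item in purchases[customer]:
            if item not in seen:
                votes[item] += 1
    return votes


if __name__ == "__main__":
    index = build_index(PURCHASES)
    basket = ["tent", "stove"]
    similar = similar_customers(index, basket)
    print(similar)
    print(next_actions(PURCHASES, similar, basket))
```

In this toy data, the customer who only ever bought a laptop and a mouse is skipped entirely, which is the whole idea: the context does the filtering before any reasoning starts.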
This ability is what’s called collaborative filtering, and it can cluster your search work in a far more intuitive and performant manner than the alternatives, accelerating your AI/ML project. My second example is how graph databases help you avoid trying to pack everything into a giant matrix. Graphs record only the positive relationship values, the connections that actually exist, which means you can easily reach other people or objects that share these relationships, far faster than you can with conventional column-and-row data structures.
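As a rough illustration of the matrix-versus-graph contrast, the Python sketch below stores the same hypothetical “bought” relationships both ways and answers the same question: who else bought something this customer bought? All names and data are invented, and a real graph database is far more sophisticated than a dictionary of sets, but the difference in shape is the same. The matrix holds a cell for every possible customer and product pair, while the graph holds only the relationships that exist.

```python
# Hypothetical data: five customers, four products, three "bought" relationships.
CUSTOMERS = ["alice", "bea", "carl", "dina", "ed"]
PRODUCTS = ["tent", "stove", "laptop", "mouse"]
BOUGHT = [("alice", "tent"), ("bea", "tent"), ("dina", "laptop")]


def as_matrix(customers, products, bought):
    """Row-and-column view: one cell per customer/product pair, mostly zeros."""
    pairs = set(bought)
    return [[1 if (c, p) in pairs else 0 for p in products] for c in customers]


def as_graph(bought):
    """Graph view: adjacency sets holding only relationships that exist.

    Toy simplification: customers and products share one dictionary,
    so their names must not collide.
    """
    graph = {}
    for customer, product in bought:
        graph.setdefault(customer, set()).add(product)
        graph.setdefault(product, set()).add(customer)
    return graph


def co_buyers_matrix(matrix, customers, customer):
    """Scan the customer's whole row, then a whole column per product bought."""
    row = matrix[customers.index(customer)]
    found = set()
    for j, bought in enumerate(row):
        if bought:
            for i, other in enumerate(customers):
                if matrix[i][j] and other != customer:
                    found.add(other)
    return found


def co_buyers_graph(graph, customer):
    """Follow relationships directly: customer -> products -> other customers."""
    return {
        other
        for product in graph.get(customer, ())
        for other in graph[product]
        if other != customer
    }


if __name__ == "__main__":
    matrix = as_matrix(CUSTOMERS, PRODUCTS, BOUGHT)
    graph = as_graph(BOUGHT)
    print(co_buyers_matrix(matrix, CUSTOMERS, "alice"))
    print(co_buyers_graph(graph, "alice"))
```

Both functions give the same answer here. The matrix version’s work grows with the total number of customers and products, whereas the graph version’s work grows only with the number of relationships it actually follows.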
This allows you to move your AI/ML endeavours from a batch process (let my computer just crunch all these calculations for a week) to something far more real-time. You can then start looking at combinations of things, seeing when they relate to each other and what types of relationships they have, so that you can start to generate real understanding.
Given all these advantages, it’s clear that graph technology could be the missing link: one that can help you better store and understand your AI training data, so that you can answer that amazing next question, the one you don’t yet know you have.