David Fearne, Technical Director at Arrow, talks about the company’s recent social and technological experiment: How Happy Is London?
So, first things first, what exactly is How Happy Is London?
How Happy Is London? (HHIL) is a demonstration of how the intelligence and insight uncovered by large-scale data analytics can lead to better business decision-making. The project shows what can happen when you take billions of pieces of data from unconnected sources in the public domain, integrate them, analyse them and transform the results to give meaningful answers.
In this case, we wanted to answer the question: How Happy is London? Everyone knows London and everyone understands the concept of happiness, so measuring the city’s happiness is something everyone can relate to. The final output is an up-to-the-minute picture of the city’s mood, refreshed with new data every 60 seconds. The data is represented online as a series of images and a ‘Happiness Indicator’ ranking, which shows where London’s mood currently sits between ‘Business as usual’ and ‘On top of the world’.
How does it work?
The system follows an architectural model for manipulating, storing and exploring data. For instance, we have an ingestion layer, which takes all the data from its sources and brings it into an ETL (Extract, Transform, Load) phase that enriches the data. Then the computation and algorithm stages run. The data is then stored in structured and unstructured data platforms and presented to the various users via a combination of Apache NiFi and API connectivity.
We explore and visualise the data using the website, a video screen installation in our London office and a live Twitter feed. Finally, there is an API that lets people interact with the data and use it in their own applications.
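To make the layering concrete, here is a minimal sketch of how ingestion, ETL, computation and storage might chain together. It is an illustration only, not the HHIL implementation: the `Reading` record, the assumption that values lie between 0 and 1, and the simple averaging in `compute` are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from statistics import mean

KNOWN_SOURCES = {"social", "transport", "weather"}  # the three core areas


@dataclass
class Reading:
    source: str   # which core area the value came from
    value: float  # hypothetical: a mood signal assumed to lie in [0, 1]


def ingest(raw_rows):
    """Ingestion layer: turn raw (source, value) rows into typed records."""
    return [Reading(source, float(value)) for source, value in raw_rows]


def transform(readings):
    """ETL step: drop unknown sources and clamp values into [0, 1]."""
    return [
        Reading(r.source, min(max(r.value, 0.0), 1.0))
        for r in readings
        if r.source in KNOWN_SOURCES
    ]


def compute(readings):
    """Computation layer (assumed): average per source, then across sources."""
    by_source = {}
    for r in readings:
        by_source.setdefault(r.source, []).append(r.value)
    per_source = {s: mean(values) for s, values in by_source.items()}
    return {"per_source": per_source, "score": mean(list(per_source.values()))}


def run_pipeline(raw_rows, store):
    """Run every layer in order and append the result to a store (a list here)."""
    result = compute(transform(ingest(raw_rows)))
    store.append(result)
    return result
```

Keeping each layer as a separate function means one stage can be swapped (for example, a different scoring rule in `compute`) without touching the others.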
What is the data volume being processed?
We process 2.6 billion data points every single day for three core areas – social, transport and weather. For weather, we pull about 36,000 lines every minute. For traffic on London’s roads, every single time we query the API we get 180,000 lines of data. We calculate every single section of traffic to understand it, down to each longitude and latitude coordinate, so if a road bends a lot we get a lot of coordinates.
What is the methodology behind it?
Before we store the data, we refine it and dispose of anything we don’t need. For example, we take the 180,000 lines of road data and refine them down to 3,600 lines that quantify whether a road is running clear or blocked.
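As a rough sketch of this kind of refinement, the snippet below collapses many coordinate-level rows into one status row per road section. The field names (`section_id`, `speed_ratio`) and the 0.5 threshold are assumptions made for illustration; the interview does not describe the actual fields or rules HHIL uses.

```python
from collections import defaultdict


def refine_road_data(rows, blocked_below=0.5):
    """Reduce coordinate-level rows to one clear/blocked row per road section.

    Each input row is a dict with hypothetical keys:
      section_id  - identifier of the road section the coordinate belongs to
      speed_ratio - current speed divided by free-flow speed (assumed metric)
    """
    ratios = defaultdict(list)
    for row in rows:
        ratios[row["section_id"]].append(row["speed_ratio"])

    refined = []
    for section_id, values in sorted(ratios.items()):
        average = sum(values) / len(values)
        status = "blocked" if average < blocked_below else "clear"
        refined.append({"section_id": section_id, "status": status})
    return refined
```

A bendy road contributes many input rows but still yields a single output row, which is how a large feed can shrink to a small number of lines before storage.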
How do you ensure data security?
In general, we don’t need to, because all the data is open data. We don’t encrypt the data itself as it’s not sensitive. However, we do apply a certain amount of access security. For example, we use key-based authentication and encryption when people access the data remotely.
Is the algorithm secret?
No, it’s not secret. As far as reselling HHIL goes, the architecture as a whole has no commercial value because it is so specific to measuring how happy London is. However, it’s open to everyone, as we want people to adopt it, leverage it and build their own ‘How Happy Is’ for their company. The results are presented as a REST API under an open data license, so any third party can register and take advantage of the Happiness Index for their own applications.
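For a third party, consuming such an API could look roughly like the sketch below. Everything specific here is an assumption for illustration: the URL is a placeholder, the bearer-token header stands in for whatever key-based scheme registration provides, and the `score` and `label` response fields are hypothetical rather than the documented HHIL schema.

```python
import json
from urllib.request import Request, urlopen

# Placeholder endpoint, not the real HHIL address.
API_URL = "https://api.example.com/happiness/latest"


def build_request(api_key, url=API_URL):
    """Attach the registered key to the request (header scheme is assumed)."""
    return Request(
        url,
        headers={"Authorization": f"Bearer {api_key}", "Accept": "application/json"},
    )


def parse_happiness(payload):
    """Extract the hypothetical 'score' and 'label' fields from a JSON body."""
    data = json.loads(payload)
    return {"score": float(data["score"]), "label": str(data["label"])}


def fetch_happiness(api_key, url=API_URL):
    """Call the API over HTTPS and return the parsed happiness reading."""
    with urlopen(build_request(api_key, url), timeout=10) as response:
        return parse_happiness(response.read().decode("utf-8"))
```

Separating `parse_happiness` from the network call lets an application test its handling of the response without contacting the service.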
Can other companies publish this data?
Yes. We consume open data, so it seems only right to provide open data. Anyone can use it, as long as they credit us as the source somewhere.
Who could find this useful?
Anyone who is looking to do something more insightful with their data or their customers’ data – or to become a data-driven organisation. There is an incredibly wide customer base, as analytics is going to be a big market over the next two to three years, especially as competitive margins become thinner and thinner, making data-driven decisions essential to gaining an edge.
Will the project continue and evolve or will it be switched off?
HHIL will grow over time. The whole point of the project is to help customers who say “I have no idea what analytics is” and to give them a first foot on the ladder. This is the simplest possible way of using analytics: taking numerous data points and combining them to understand how well your business is doing.
Phase two of the project is going to be about predictive analytics. Based on what we know happened this time last year, we can see what might happen next year. We will be able to predict how happy London will be tomorrow. This fits with the natural progression of the adoption of analytics in the industry: as our partners adopt more complex technologies, their end customers are becoming more confident in and more reliant on the technology. We want to build HHIL to accommodate the growing interest in more complex analytics.
Phase three will be focused on machine learning and artificial intelligence. We know how happy London is today and have predicted how happy London will be tomorrow – but how can we change the outcome of the happiness of London? In a business context, you know how well your business is doing today and we can predict how well it will do tomorrow – but what if we could suggest ways to increase the success of your business?
It’s scary stuff but very doable using machine learning algorithms and artificial intelligence – and is the ultimate goal of the How Happy Is London? project.
Any more future plans?
Watch this space for other How Happy is…? projects in major cities across the globe.