There is a lot of hype about machine learning models. The reality, however, is that they are expensive to build, train, validate and curate. Data scientists are needed to manage this entire process, and the more complex the data becomes, the larger the team of data scientists needs to be. Because of the amount of human intervention required, there is also a time lag, which leaves the models increasingly out of date as a business and its market evolve. This is the real bottleneck in machine learning, and even more so when the data involved is text or otherwise unstructured. This is where human-in-the-loop (HITL) comes in.

Data Science and Data Transformation

Many enterprises have invested in big data infrastructure, machine learning and data science teams. However, many have reached a crunch-point where the amount of effort required to gain insight from their data is prohibitive. This is especially true of the effort data scientists spend cleansing and transforming data.

The primary reason why most companies are analysing less than 1% of their data is that unstructured or text data makes up c. 90% of it. Estimates place the share of data scientists' time spent cleansing and transforming data at 80% (New York Times and Crowdflower).

Imagine a world where machine learning is cheaper and quicker to deploy, and where data scientists are no longer required to tune and drive models. Imagine how nimble companies could be then.

HITL and Machine Learning Models

Human-in-the-loop (“HITL”) is a branch of artificial intelligence that leverages both human and machine intelligence to create machine learning models. A human-in-the-loop approach involves people in a virtuous circle in which they train, tune and test a particular algorithm or process. The term can, however, mean a lot of different things, from running a simulation with a human observer involved through to humans ‘labelling’ training sets for machine learning. Focusing on the machine learning use case, HITL is universally recognised amongst data scientists as the ‘ideal’ solution, yet it still carries challenges. Different human classifiers or ‘labellers’ vary both in quality and in their judgement in certain circumstances. It is also expensive, even when the labellers are not data scientists: a data scientist still has to test and curate the machine learning models after each labelling exercise, and again at regular intervals, to check and update the models and prevent them from degrading.

Many HITL situations involve outsourcing to third parties, which carries risk and compliance issues (e.g. GDPR) as well as the consistency and retraining points referred to above. Changes in the business or market can change the data, so the models will need constant refreshing.

So if the data science community agrees that HITL is the optimal solution, how do you optimise HITL itself to address the challenges above?

HITL Optimisation

One example of this new approach to HITL comes from Warwick Analytics, a spin-out from the University of Warwick. The company has a new technology called Optimized Learning, which can dramatically reduce the requirement for labelling in the first place, minimising human intervention. It resides in PrediCX and acts as a supercharged ‘labelling engine’. The basic premise is that PrediCX judges its own uncertainty. It then invites the user to manually label (aka tag or classify) the items it most needs in order to maximise its performance. This in turn keeps human intervention to a minimum. The person intervening just needs domain knowledge and doesn’t need to be a data scientist. The labelling required is ‘just enough’ to achieve the requisite business performance. Also, because it invites human intervention when there’s uncertainty, it can spot new topics. This keeps the models maintained at their requisite performance.
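
The idea of a model judging its own uncertainty and asking for labels only where it is least sure is generally known as uncertainty sampling, a form of active learning. The sketch below illustrates that general technique only; it is not PrediCX's actual algorithm, which the article does not describe. The function names, the ticket ids, the probabilities and the labelling budget are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a probability distribution over labels."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_for_labelling(predictions, budget):
    """Return the ids of the `budget` items the model is least sure about.

    predictions: dict mapping an item id to the model's class probabilities.
    Higher entropy means the probability mass is spread more evenly across
    labels, i.e. the model is more uncertain about that item.
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabelled customer tickets.
predictions = {
    "ticket-1": [0.98, 0.01, 0.01],  # confident
    "ticket-2": [0.40, 0.35, 0.25],  # uncertain
    "ticket-3": [0.70, 0.20, 0.10],  # fairly confident
    "ticket-4": [0.34, 0.33, 0.33],  # almost no idea
}

# Only the two most uncertain tickets are sent to a domain expert.
to_label = select_for_labelling(predictions, budget=2)
print(to_label)
```

Under these assumed probabilities, the two tickets with the most evenly spread predictions are selected, and the confident ones never reach a human. An item that fits none of the existing labels well would also tend to score as uncertain, which is one plausible way such a scheme could surface new topics.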

Where there are inconsistencies in labelling, it performs cross-validation and invites intervention. If there are differences in labelling, labels can merge or move around in hierarchies. If the performance at the granular level isn’t high enough, it will choose the coarser level, just as a human might. It’s human-in-the-loop, but the human element is kept to an absolute minimum, and the human doesn’t have to be a data scientist, which makes it much more accessible.
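
The fallback from a granular label to a coarser one can be sketched as a walk down a label hierarchy that stops as soon as the model's score drops below a threshold. This is a minimal illustration under stated assumptions: the hierarchy, the per-level scores, the `choose_label` function and the 0.8 threshold are all hypothetical, and the article does not say how PrediCX measures performance at each level.

```python
def choose_label(path_scores, threshold=0.8):
    """Pick the most specific label whose score meets the threshold.

    path_scores: list of (label, score) pairs ordered from the coarsest
    level of the hierarchy to the finest.
    Returns None if even the coarsest level falls short, which a caller
    could treat as a cue to ask a human.
    """
    chosen = None
    for label, score in path_scores:
        if score < threshold:
            break  # finer levels are not trusted once a level fails
        chosen = label
    return chosen


# Hypothetical scores for one query at three levels of a label hierarchy.
query_scores = [
    ("Hardware", 0.95),
    ("Hardware/Motor", 0.88),
    ("Hardware/Motor/Overheating", 0.55),
]

label = choose_label(query_scores)
print(label)
```

Here the finest level scores too low, so the query is labelled at the intermediate level rather than being forced into a doubtful fine-grained category.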

Customer Interactions and Time

One example of how this latest technology is being used is a global industrial manufacturing company that receives many thousands of customer interactions per day from engineers working at factories using its products. Most of these interactions are handled through the company’s abundant online resources, although thousands of customers still contact the company through multiple channels, including telephone and the webforms that raise tickets. Time is of the essence, particularly if a manufacturing line stops. This, combined with the vast array of complex products, means that the company needs a large number of skilled operators in its contact centre with a high degree of expertise.

The company used PrediCX to accurately classify the incoming queries and match them with the ‘next best response’, be it a resolution or a request for further clarification of the symptoms or details of the question so that a resolution can be found. A human operator would validate this before replying to the customer. Because the system provides accuracy scores for its labelling, thresholds can be set to automate the response (when the confidence is very high) or to guide humans towards the best response when there are competing options. It is easy to plug into any system via an API.
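
The threshold logic described above can be sketched as a small routing function: reply automatically when the top candidate is very confident, otherwise hand a shortlist to an operator. This is an illustrative sketch only and does not use the PrediCX API, which the article does not document. The `route_query` function, the 0.95 threshold, the shortlist size and the example responses are all assumptions.

```python
def route_query(candidates, auto_threshold=0.95, shortlist_size=3):
    """Decide whether a query can be answered automatically.

    candidates: list of (response, confidence) pairs from a classifier.
    Returns a dict with an `action` ("auto_reply" or "human_review") and
    the response or responses to use, best first.
    """
    if not candidates:
        # Nothing to suggest, so a human has to handle the query.
        return {"action": "human_review", "responses": []}

    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    best_response, best_confidence = ranked[0]

    if best_confidence >= auto_threshold:
        return {"action": "auto_reply", "responses": [best_response]}

    # Competing options: show the operator the top few, best first.
    shortlist = [response for response, _ in ranked[:shortlist_size]]
    return {"action": "human_review", "responses": shortlist}


# Hypothetical classifier outputs for two incoming queries.
clear_case = route_query([("Reset the drive", 0.97), ("Check the wiring", 0.02)])
unclear_case = route_query(
    [("Check the wiring", 0.45), ("Reset the drive", 0.48), ("Update firmware", 0.07)]
)
print(clear_case["action"], unclear_case["action"])
```

Where the automation threshold sits is a business decision: a higher value sends more queries to operators, while a lower value automates more at the cost of occasional wrong replies.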

HITL Labels

As well as dramatically cutting down the time operators needed to assist, the labels themselves proved invaluable for the managers of the knowledge base. They showed which queries were being asked most often and provided ‘early warning’ of new, emerging issues which would not otherwise be picked up by machine learning. This insight was used to enhance and develop the knowledge base accordingly, and also to optimise the online resources, enabling customers to search for issues in free text and retrieve the appropriate response without needing to contact the contact centre. There was a further benefit: product managers could understand from the labels what was happening to their products, see whether enhancements to usability and reliability were needed, and even identify which factors were likely to predict an issue so that recurrence could be prevented.

In conclusion, whilst human-in-the-loop is recognised as the optimal solution for machine learning, it is fraught with practical problems. There are, however, technologies appearing that have already been tried and tested and are being deployed to great effect using HITL. Truly, man and machine can work together to improve outcomes and maximise the potential of both.