When most people think about the increasing sophistication of artificial intelligence (AI), they tend to focus on algorithmic improvements, increased computing power or breakthroughs by technologists.

Whilst these are fundamental to progress, what is often forgotten is the need to collect and store huge troves of data. We would never have seen Google’s AlphaGo beat Go master Lee Se-dol or Facebook’s poker bot best some of the world’s top Texas Hold’em players were it not for contemporary data capture and storage solutions.

Data is the fundamental resource driving innovation in AI, and progress in the field is inextricably tied to the ability to store that data effectively. The value of AI in countless applications, and the cost savings it will deliver, will drive much greater adoption.

AI developments will result in labour savings of over $40 billion in federal, state, and local government, according to a study by Deloitte. And once we apply this to private industry, we’re looking at savings of more than $5 trillion per year.

It’s certainly an exciting prospect, but if industries are to break through as quickly as they’d like to, there needs to be a way for all of that data to be captured and harnessed.

Understanding the resources AI needs

There’s a reason why companies such as Facebook and Google are at the forefront of this nascent technology. It is partly their ability to attract the best talent and afford the latest tools, but most importantly it is the sheer volume of their datasets.

Google receives a staggering two trillion searches a year, while Facebook collects and collates multiple data points every second from each of its 2.2 billion active users. AI uses these datasets to ‘learn’ and find patterns, becoming smarter as it digests more and more data.

Just as a skyscraper requires cement, or a human mind requires books, so too does AI require data to function. We are at a tipping point, and once we start to reduce the costs associated with storing data, we will see the technology improve at a dramatic pace.

How driverless cars can avoid hitting a brick wall

One of the most pertinent examples of AI taking shape is in the automotive industry, which is experiencing radical change as autonomous vehicles (AVs) start to become a reality. The likes of Tesla and Google X have been spearheading AV development, but traditional incumbents are also starting to invest heavily in the technology – one example being BMW’s recent deal with Daimler – to push their businesses towards driverless cars.

Driverless vehicles such as Waymo’s generate over 11 terabytes of data a day, though they currently only use AI in a limited capacity. A lack of data, rather than any problem with the engineering teams themselves, is one of the reasons progress has been stifled.

Today, driverless vehicles are trained to gather only a targeted portion of the data in their surrounding environment. This is to conserve storage, which is finite, and it means the algorithms only ever see a subset of an AV’s context.

This is not the ideal way to train AI. The best way to train AI in an AV is the same way we train humans to drive: holistically capture all the information that is available in the environment. However, doing so would dramatically increase the data that needs to be collected and stored, from 11 TB a day to around 200 TB a day.

What this means in practice is that a single vehicle would need to store 73 petabytes of data every year, which would cost roughly $21 million to store in the cloud.

This may be viable for a small number of flagship or proof-of-concept vehicles, but it makes scaling across millions of vehicles impossible. If we can reduce the costs associated with cloud storage, we can realistically put autonomous vehicles on the market.
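The arithmetic behind these storage figures is easy to check. The short Python sketch below is illustrative only: the function names are hypothetical, it assumes decimal units (1 PB = 1,000 TB), a 365-day year, and that the full 73 PB is billed for all twelve months, and the one-million-vehicle fleet is an assumed figure chosen purely to show the scale. None of these assumptions comes from a published price list.

```python
# Back-of-the-envelope check of the AV storage figures quoted in the text.
# Assumptions (not from the article): decimal units, 365-day year,
# and the whole yearly volume billed for a full 12 months.

TB_PER_PB = 1_000
GB_PER_PB = 1_000_000


def yearly_petabytes(tb_per_day: float, days: int = 365) -> float:
    """Convert a daily data volume in TB into a yearly volume in PB."""
    return tb_per_day * days / TB_PER_PB


def implied_price_per_gb_month(yearly_cost_usd: float, petabytes: float) -> float:
    """Storage price per GB per month implied by a yearly bill."""
    return yearly_cost_usd / (petabytes * GB_PER_PB) / 12


current_pb = yearly_petabytes(11)     # today's targeted capture
holistic_pb = yearly_petabytes(200)   # full-environment capture
price = implied_price_per_gb_month(21_000_000, holistic_pb)

# Hypothetical fleet size, used only to illustrate the scaling problem.
ASSUMED_FLEET_SIZE = 1_000_000
fleet_cost_usd = 21_000_000 * ASSUMED_FLEET_SIZE

print(f"Targeted capture: {current_pb:.3f} PB per vehicle per year")
print(f"Holistic capture: {holistic_pb:.1f} PB per vehicle per year")
print(f"Implied price:    ${price:.4f} per GB per month")
print(f"Fleet of {ASSUMED_FLEET_SIZE:,} vehicles: ${fleet_cost_usd:,} per year")
```

Under these assumptions, 200 TB a day does come to 73 PB a year, and the $21 million figure implies a price of a little over two cents per gigabyte per month. Multiplied across a fleet of the assumed size, the yearly bill runs into the trillions of dollars, which is why the per-gigabyte price matters so much.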

It’s not just driverless cars making leaps in AI

We’re seeing AI and machine learning solutions emerge across countless industries – post-production in the media offers another good example.

The possibilities for studios and production teams will become seemingly limitless once AI-driven tools are implemented. For example, searching an entire content library for specific scenes, spoken words or even characters will be instantaneous. Production teams will be able to collaborate on content in real time from disparate locations, fostering dynamic development processes and making production far more efficient.

To achieve the faster data flow required to enable this progress, the industry is increasingly moving away from on-premises storage and towards the cloud. Currently 5,000 petabytes are stored in the cloud in this space, but this is expected to grow to more than 130,000 petabytes by 2021. To cope with that influx of data, moving to the cloud is a no-brainer.

AI has the capacity and potential to disrupt almost every single industry. No matter which industry we’re discussing, the proliferation of data required to make it a reality and the need for faster and cheaper access to it is the true enabler. This explains why cloud storage services that democratise data storage are vital for any company looking to take that next step in their technology.

The good news is that cloud storage is exploding as an industry, and many solutions have cropped up in recent years that are more cost-effective than the ones companies have historically been tied to. Specialised cloud storage providers like ours are tackling issues of scalability by dramatically reducing the costs associated with storing data.

If we want to focus on algorithmic breakthroughs and drive forward innovation in AI, we’re looking at data, data and more data. For companies that learn how to harness that data so it can be captured and analysed effectively, moving to the cloud is really the only viable option. For those that understand this, there’s a lot to be gained.