The 7 Deadly Sins of Data Loss Prevention in the Cloud

The world of work is changing. More people are working remotely, and more data is being stored and transferred using cloud-based apps, leading enterprises to employ a hybrid of on-premises and cloud-based applications.

This creates a fantastic opportunity to identify cost savings and to make use of talent wherever it is located. However, it also exposes significant security shortcomings when traditional Data Loss Prevention (DLP) solutions are deployed. In particular, as organisations move more of their sensitive data onto SaaS providers’ servers, traditional DLP solutions are unable to provide sufficient visibility into the SaaS environment or, worse yet, cannot operate in the “as a service” environment at all.

Let’s take a look at seven challenges of providing DLP for cloud file sharing applications.

1. Lacking basic visibility

The most obvious shortcoming of traditional DLP is that it can only monitor traffic on enterprise-controlled assets. However, traffic to and from a cloud application might not go over an enterprise network at all. It could be generated by a mobile user, through a native mobile application, over a mobile network. Such traffic is out of sight of traditional on-premises enterprise solutions and, as such, is fundamentally beyond the scope of what classic DLP solutions are designed to handle.

2. Failing to interpret encrypted traffic

Traffic to and from cloud applications is typically encrypted. Even if a traditional DLP solution managed to gain network-level visibility into this traffic, it might not be able to interpret the underlying content. Again, without basic visibility, there is little that can be accomplished by traditional solutions.

3. Interpreting links versus raw data

Traditional DLP solutions are predicated on processing raw data directly. However, data is never shared directly in cloud file sharing applications. Instead, what is shared is some type of link to the content. The link reveals little to no useful information about the content it points to. What must be analysed is the content behind the link, rather than the link itself, which is again something traditional DLP solutions can’t do.

4. Using ‘perimeter defence’ sharing semantics

In the context of traditional enterprise environments, data loss or leakage has a well-defined meaning: data crossing the enterprise perimeter. For cloud file sharing applications, however, the definition of leakage or loss is fundamentally different for two reasons. First, once data is hosted with a cloud provider, it already resides outside the enterprise network and can be shared with external third parties. Second, data in a cloud application is shared on a per-user basis. For example, if you want to share a file with someone else, you can typically do so by simply entering that person’s email address or the username they use for the application. Traditional DLP solutions do not understand these sharing semantics, and cannot assess whether data is being “lost” or leaked.

5. Applying algorithms not designed for file-based data

Traditional DLP technologies may make different assumptions about the data they have to process: they may assume that data is transmitted in a stream and has to be processed as such. With SaaS-based file sharing applications, the data model generally gives access to entire files containing sensitive data. Algorithms that are designed for streaming data might not perform well on file-based data. As a result, to achieve optimal performance for SaaS-based enterprise file sharing applications, it is important to develop algorithms designed specifically to take advantage of full-file content.
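One way a stream-oriented scanner can fall short is sketched below. This is an illustrative assumption rather than a description of any particular product: the card-number pattern, the sample text, and the chunk boundary are all made up, and real streaming scanners may buffer across chunk boundaries. The sketch only shows that a detector which inspects each chunk in isolation can miss a value that a full-file scan finds.

```python
import re

# Hypothetical detector: 16 digits, optionally grouped in fours
# by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


def scan_stream(chunks):
    """Scan each chunk on its own, as a naive stream scanner might."""
    hits = []
    for chunk in chunks:
        hits.extend(CARD_PATTERN.findall(chunk))
    return hits


def scan_file(text):
    """Scan the complete file content in one pass."""
    return CARD_PATTERN.findall(text)


# Made-up sample content; 4111 1111 1111 1111 is a well-known test number.
document = "Invoice for A. Customer, card 4111 1111 1111 1111, due 2024-01-31."

# Simulate a transport boundary that happens to fall inside the number.
split_at = document.index("4111") + 10
chunks = [document[:split_at], document[split_at:]]

stream_hits = scan_stream(chunks)  # neither chunk holds all 16 digits
file_hits = scan_file(document)    # the full text does
print("per-chunk scan:", stream_hits)
print("full-file scan:", file_hits)
```

With the whole file available, there is no boundary to straddle, which is one reason full-file algorithms can be simpler and more reliable in this setting.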

6. Viewing content myopically, while ignoring broader context

Traditional DLP solutions might examine a piece of content in isolation and use that as the sole basis for determining whether or not the transmission of that content represents a violation. For DLP in cloud applications, we have access to much richer context about a particular file. Has the file only been shared internally, or is it being shared externally too? Who originated the sharing of the file? Was it an external party, or did it come from the inside?

Context is important not just for determining whether a policy is violated, but also for remediating issues. You might be fine with a SaaS application hosting a file containing specific content, as long as the users with whom the file is shared are internal to the organisation. If an attempt is made to share the content with an unauthorised third party, then you might want to block only this type of access. Because traditional DLP solutions are not privy to the mechanics of cloud-based file sharing applications, they are unable to provide enforcement capabilities that are consistent with the way these applications work.
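The kind of context-aware remediation described above can be sketched roughly as follows. Everything here is hypothetical: the `ShareEvent` structure, the `example.com` internal domain, and the idea that a sensitivity flag has already been computed elsewhere are assumptions for illustration, not the API of any real file sharing service.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the organisation's users all share this email domain.
INTERNAL_DOMAIN = "example.com"


@dataclass
class ShareEvent:
    """Hypothetical record of one file being shared with some users."""
    file_name: str
    contains_sensitive_content: bool  # assumed to be set by a content scan
    shared_by: str
    shared_with: List[str]


def is_internal(email: str) -> bool:
    """Treat an address as internal if it belongs to the internal domain."""
    return email.lower().endswith("@" + INTERNAL_DOMAIN)


def recipients_to_block(event: ShareEvent) -> List[str]:
    """Return only the recipients whose access should be revoked.

    Hosting the file and sharing it internally are both allowed; only
    external recipients of sensitive content are flagged.
    """
    if not event.contains_sensitive_content:
        return []
    return [r for r in event.shared_with if not is_internal(r)]


event = ShareEvent(
    file_name="payroll.xlsx",
    contains_sensitive_content=True,
    shared_by="alice@example.com",
    shared_with=["bob@example.com", "carol@partner.example.org"],
)
blocked = recipients_to_block(event)
print("revoke access for:", blocked)
```

The point of the design is that the decision is made per recipient rather than per file, which mirrors the per-user sharing model of these applications.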

7. Relying on pattern matching

Traditional enterprise DLP technologies rely primarily on basic pattern matching and regular expressions for identifying sensitive content. To identify a credit card number, the DLP solution might look for sixteen digits formatted in a particular way. While this approach will be highly sensitive in finding credit card numbers, it will have poor specificity: many files contain strings of digits that can be misconstrued as credit card numbers. To address this concern, it is important to apply techniques from natural language processing and machine learning. These approaches go beyond simply trying to understand the raw content, and instead focus on understanding the underlying context. For example, the presence of a sixteen-digit number is, by itself, ambiguous. If, however, we see a name, an address, and a date in proximity to that number, then we can have more confidence that we are dealing with sensitive financial information.
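The difference between a bare pattern match and a match weighed against its surroundings can be illustrated with a small sketch. Note that this uses a simple keyword-and-date heuristic as a stand-in; it is not natural language processing or machine learning, and the pattern, keyword list, and window size are arbitrary assumptions chosen for the example.

```python
import re

# Hypothetical candidate pattern: 16 digits, optionally grouped in fours.
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")
# Hypothetical expiry-style date such as 01/27 or 01/2027.
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2,4}\b")
# Assumed context words that tend to appear near payment details.
CONTEXT_KEYWORDS = ("cardholder", "expiry", "expires", "billing address", "cvv")
WINDOW = 80  # characters of context inspected on each side of a candidate


def score_candidates(text):
    """Return (candidate, signal_count) pairs for each 16-digit match."""
    results = []
    lowered = text.lower()
    for match in CARD_PATTERN.finditer(text):
        start = max(0, match.start() - WINDOW)
        end = min(len(text), match.end() + WINDOW)
        # Leave out the candidate itself so only its surroundings count.
        nearby = lowered[start:match.start()] + " " + lowered[match.end():end]
        signals = sum(keyword in nearby for keyword in CONTEXT_KEYWORDS)
        if DATE_PATTERN.search(nearby):
            signals += 1
        results.append((match.group(), signals))
    return results


payment_text = ("Cardholder: J. Smith, card 4111 1111 1111 1111, "
                "expires 01/27, billing address 1 High St.")
shipping_text = "Shipment tracking reference 1234 5678 9012 3456 for pallet 7."

payment_scores = score_candidates(payment_text)
shipping_scores = score_candidates(shipping_text)
print(payment_scores)
print(shipping_scores)
```

Both texts match the sixteen-digit pattern, but only the first has supporting signals nearby, so a policy could require a minimum signal count before raising an alert.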

These seven challenges highlight that data loss prevention in the new world of work needs to be redefined. DLP for SaaS applications is starkly different from what needs to be done for traditional on-premises enterprise applications. Clearly, for forward-thinking organisations that now rely on a hybrid of on-premises and cloud-based applications, a new approach to data loss prevention is vital to ensure security.
