Hybrid cloud implementations have grown in number as companies look to speed up their application deployments. Nearly 75 per cent of large enterprises planned to have hybrid cloud deployments in place by the end of 2015, according to Gartner. Moreover, the amount of data stored in hybrid cloud deployments is expected to double in the next two years.

[easy-tweet tweet="Nearly 75% of large enterprises planned to use hybrid #cloud deployments by the end of 2015" hashtags="Gartner"]

Hybrid cloud deployments are becoming more popular as IT teams want to get the flexibility of cloud while also making use of their internal IT skills and assets. By using a mix of on-premise IT, private cloud and third-party services, IT can deliver applications that can scale alongside the needs of the business.

Beyond this wider goal, hybrid cloud implementations can help company IT teams be more flexible than deploying internal IT or using public cloud services alone. Options include expanding infrastructure resources only when required to meet peak demand, outsourcing infrastructure management and responsibility to a third party, and more tactical deployments such as sourcing secondary storage for disaster recovery.

However, running database implementations as part of these hybrid cloud deployments can present challenges. These fall into three categories:

#1: How Simple Is It to Manage?

The first thing to consider when running a database in a hybrid cloud configuration is how easy it will be to have the database run across multiple locations. If the IT team wants to base the application or service on a single database, then the technical architecture of that database has to be able to span multiple locations without performance problems or heavy lifting to make it work.

At this point, it’s important to look at the underlying architecture of any database being considered. Master-slave implementations are common for relational databases: one “master” instance handles all writes, while a number of “slave” nodes are used primarily for read operations. In contrast, a masterless database is one where every node is the same; all can service read and write activities.

For hybrid cloud deployments, master-slave implementations will almost always have parts of the cluster that are devoted to different activities and functions. For example, some parts of the cluster will handle write operations, while others only handle reads or are marked as failover-only. This can make it more difficult to manage the distribution of data over wide geographic areas.

A masterless database takes the opposite approach – any node within the cluster can take on any operation at any time. This approach can be better suited to hybrid cloud deployments where the number of nodes in the cluster can go up or down in response to demand. At the same time, this approach can service applications that are geographically dispersed, as the nearest node in the cluster can handle all operations rather than needing requests to go back and forth to a master node.
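The routing difference described above can be sketched in a few lines of Python. This is an illustrative model only: the node names, regions, and routing rules are invented, not taken from any particular database.

```python
# Hypothetical sketch contrasting request routing in master-slave and
# masterless clusters. All node/region names are invented for illustration.

class MasterSlaveCluster:
    """Writes must go to the single master; slaves serve reads only."""
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def route(self, operation):
        if operation == "write":
            return self.master   # every write crosses to the master's location
        return self.slaves[0]    # reads can stay on a nearby slave


class MasterlessCluster:
    """Every node is a peer; any node can serve reads and writes."""
    def __init__(self, nodes):
        self.nodes = nodes

    def route(self, operation, client_region):
        # Pick the node co-located with the client, regardless of operation.
        for node in self.nodes:
            if node.startswith(client_region):
                return node
        return self.nodes[0]


ms = MasterSlaveCluster(master="dc1-master", slaves=["dc2-slave", "dc3-slave"])
ml = MasterlessCluster(nodes=["dc1-node", "dc2-node", "dc3-node"])

# A client in dc3: with master-slave, its writes must travel to dc1,
# while the masterless cluster serves them locally.
print(ms.route("write"))                        # dc1-master
print(ml.route("write", client_region="dc3"))   # dc3-node
```

The key point the sketch makes is that the masterless cluster's routing decision depends only on client proximity, never on the operation type, which is what keeps geographically dispersed deployments simple.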

One of the biggest concerns is how to move data intelligently between on-premise infrastructure and cloud providers to support a service. A masterless architecture with flexible replication capabilities greatly simplifies hybrid cloud database management: read/write anywhere independence, the ability to selectively replicate data between on-premise and cloud providers, and data kept synchronised across all locations.

Lastly, the management and monitoring tools used should seamlessly incorporate machines running the database on cloud providers alongside the on-premise hardware that houses the same database. To the tool, machines in the cloud should appear the same as those running on the enterprise’s own IT infrastructure.

#2: How Scalable Is It?

One of the biggest draws of the hybrid cloud model is the ability to scale quickly. The aim is to avoid compute resources sitting idle, so capacity must be able to expand or shrink based on either current or forecast demand.

However, predictably scaling a database across internal IT resources and external cloud services is not an easy task. It’s important to understand how databases scale in the cloud, so that performance and scalability requirements can be met. While all databases should be able to scale initially, not all databases are able to scale in a predictable fashion.

Again, the architecture of any database plays a key role in how scalable it will be across private data centres and cloud providers. For some databases, scale is not linear, so the resources needed to cope with an increase in service requests grow disproportionately. For masterless database architectures, linear scalability should be available for both read and write operations simply by adding nodes.
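The cost of non-linear scaling can be shown with some simple sizing arithmetic. The per-node throughput and efficiency figures below are invented for illustration; real numbers vary widely by workload and database.

```python
import math

# Illustrative sizing sketch: all throughput and efficiency numbers
# are assumptions, not benchmarks of any real database.

def nodes_needed_linear(target_ops, per_node_ops=10_000):
    """With linear scalability, capacity grows in direct proportion
    to node count, so sizing is simple division."""
    return math.ceil(target_ops / per_node_ops)

def nodes_needed_sublinear(target_ops, per_node_ops=10_000, efficiency=0.8):
    """If each added node contributes only a fraction of its nominal
    capacity (coordination overhead), more hardware is needed. Modelled
    crudely here as a flat per-node efficiency factor."""
    return math.ceil(target_ops / (per_node_ops * efficiency))

# Sizing for 100,000 operations per second:
print(nodes_needed_linear(100_000))      # 10 nodes - predictable to budget
print(nodes_needed_sublinear(100_000))   # 13 nodes - cost grows faster than load
```

In a hybrid cloud context, the linear case is what makes cloud spend forecastable: doubling the expected load means provisioning exactly twice the nodes, on-premise or in the cloud.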

Any database used in a hybrid cloud deployment should deliver predictable scalability; master-slave architectures are less likely to do so. For hybrid cloud deployments, where investment in cloud services may be required to deliver results back to the business, unpredictable scaling means unpredictable costs, which can affect both the business case and any potential return on investment.

#3: How Secure Is The Data?

Security continues to be a big consideration for hybrid cloud deployments. Handing over any responsibility for the data can be a big hurdle to overcome, particularly if there is a perception of increased risk around unauthorised access. Other common concerns include account compromise, cloud malware, excessive data exposure and over-exposed personally identifiable information (PII).

To alleviate these worries, it’s worth looking at the security management and support tools that exist around the database and whether these can run across multiple locations. This should ensure the same levels of protection and security for data no matter where it’s housed.

As a base, encryption should be used for all data transferred over the wire, between nodes, and at rest. Similarly, authentication and access authorisation should be in place for access to data across all sites. Lastly, smart auditing functions should be applied so that database access can be monitored from both the cloud and internal locations. All of these security controls should function uniformly across the hybrid cloud deployment.   
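One practical way to enforce "uniform controls across the deployment" is to audit every site against the same checklist. The sketch below is a hypothetical example; the site names and control keys are invented, and a real deployment would pull these flags from the database's actual configuration.

```python
# Hedged sketch: a uniform security checklist applied to every site in a
# hybrid deployment. Site names and control names are illustrative only.

REQUIRED_CONTROLS = {"wire_encryption", "at_rest_encryption",
                     "authentication", "authorisation", "auditing"}

def missing_controls(site_config):
    """Return the required controls a site has not enabled."""
    enabled = {name for name, on in site_config.items() if on}
    return REQUIRED_CONTROLS - enabled

sites = {
    "on-premise-dc": {"wire_encryption": True, "at_rest_encryption": True,
                      "authentication": True, "authorisation": True,
                      "auditing": True},
    "cloud-eu-west": {"wire_encryption": True, "at_rest_encryption": True,
                      "authentication": True, "authorisation": True,
                      "auditing": False},   # gap: no audit trail in the cloud
}

for name, config in sites.items():
    gaps = missing_controls(config)
    print(name, "OK" if not gaps else f"missing: {sorted(gaps)}")
```

Running a check like this across both on-premise and cloud sites surfaces exactly the asymmetry the section warns about: a control enabled in the data centre but silently absent from the cloud side.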

Alongside security management, there is also the issue of data sovereignty to consider. This covers how stored data is subject to the laws of the country in which it is created and then located. Many of the issues surrounding data sovereignty deal with enforcing privacy regulations and preventing access to data that is stored in a data centre that is located in a different country.

Cloud and hybrid cloud computing services can work across traditional geopolitical barriers while companies can make use of multiple providers to deliver a service. With so many options available for hosting data, company IT teams have to consider which data regulations are applicable to their operations.

There have been a lot of developments around the regulations on data residency, particularly in Europe. The death of the Safe Harbour agreement has come at the same time as a new regulation on data protection in the EU. Under the terms of the European Union’s General Data Protection Regulation (GDPR), companies found breaching these rules can be fined as much as four per cent of their annual turnover. All companies doing business in Europe have two years to put safeguards and management processes in place to protect customer data, as well as maintain adequate controls over how that data is processed over time.

Adhering to data sovereignty requirements in a hybrid cloud deployment comes down to the granularity of control over where data lives over time. For example, any database deployed under an application should retain data for specific countries or geographies in the appropriate location, while other data can move freely between clouds in other geographies.

The rules should then be applied across the hybrid cloud automatically so that customer data always remains in the appropriate location or locations over time. Replication of data across database nodes for disaster protection should itself be location-aware – for example, in Europe, data can be located with nodes in Germany and France, rather than sending secondary copies of transactions or data to nodes held in other countries.

Replication in fully distributed databases like Cassandra can be controlled at the keyspace level. This allows data covered by data sovereignty requirements to be restricted to local data centres, while other data can be placed into different keyspaces that allow replication between local centres and other chosen cloud providers.
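In Cassandra, keyspace-level placement is expressed through `NetworkTopologyStrategy`, which names a replica count per data centre. The sketch below generates the corresponding CQL statements; the data centre names (`DE`, `FR`, `AWS_US`) and replica counts are hypothetical.

```python
# Sketch of keyspace-level replication control in Cassandra using
# NetworkTopologyStrategy. Data centre names and replica counts are
# illustrative assumptions, not a recommended topology.

def create_keyspace_cql(name, replication):
    """Build a CREATE KEYSPACE statement pinning replicas to named DCs."""
    dcs = ", ".join(f"'{dc}': {rf}" for dc, rf in replication.items())
    return (f"CREATE KEYSPACE {name} WITH replication = "
            f"{{'class': 'NetworkTopologyStrategy', {dcs}}};")

# EU customer data stays on nodes in Germany and France only.
eu_only = create_keyspace_cql("customer_data_eu", {"DE": 3, "FR": 3})

# Non-regulated data may also replicate to a cloud provider's US region.
global_ks = create_keyspace_cql("analytics_global",
                                {"DE": 2, "FR": 2, "AWS_US": 2})

print(eu_only)
print(global_ks)
```

Because the strategy simply omits any data centre it should not replicate to, the sovereignty boundary is enforced by the database itself rather than by application code.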

[easy-tweet tweet="#Data #security requirements in hybrid #cloud deployments come down to granularity of control over time"]

Because the database is at the heart of nearly every application, it’s important to ensure any database being considered for a hybrid cloud deployment is simple to operate in such an environment, can scale in a predictable fashion, and keeps data secure.

Robin Schumacher, Vice President of Products, DataStax
Robin has spent the last 20 years working with enterprise databases and leading product management teams. He started and led the product management team at MySQL for three years before they were bought by Sun (the largest open source acquisition in history), and then by Oracle. He also started and led the product management team at Embarcadero Technologies, which was the #1 IPO in 2000. Robin is the author of three database performance books and frequent speaker at industry events. Robin holds BS, MA, and Ph.D. degrees from various universities.