Security Considerations for Hybrid Cloud Data Management and Storage
Modern cloud storage services have extensive security mechanisms encompassing much of what enterprises have used in on-premises storage systems.
- By Paul Speciale
- January 10, 2020
One of the hottest topics in enterprise IT today is hybrid cloud -- the combination of private (on-premises) and public cloud resources -- to deploy applications or deliver new services. A hybrid cloud can encompass application workloads, compute resources, virtual networks, and data storage and management.
These new hybrid cloud architectures increase and complicate security issues such as data corruption or loss, breaches or other attacks, and privacy. Moreover, due to regulations such as the GDPR or HIPAA, enterprises using hybrid architectures must be prepared to maintain compliance practices on public clouds operated by different entities.
Enormous advances have been made in the last decade to address cloud security concerns including multitenancy, advanced authentication, and data encryption capabilities. Most enterprises today have sufficient experience to trust cloud security, but as witnessed in recent security breaches, the safe use of cloud storage depends on the enterprises’ knowledge of (and experience with) these critical technologies.
Let’s consider the security implications of three emerging hybrid cloud data management use cases that enterprises are deploying:
- Cloud data archiving: moving data from on-premises storage to a public cloud for long-term, low-cost archival storage; examples include Amazon S3 Glacier, the cool and archive access tiers of Azure Blob Storage, and low-cost services such as Wasabi. Cloud data archiving can reduce costs, save space in on-premises systems, and aid in regulatory compliance.
- Cloud data bursting: replicating or moving data from on premises to a public cloud to use with a cloud-native compute service. Common services include data analytics (e.g., Amazon EMR), media transcoding, and content delivery.
- Cloud data business continuance or disaster recovery: synchronizing data from on premises to a public cloud to preserve data in the event of a failure or outage of the on-premises data center. This data replication is often combined with redundancy of the application stack to provide a full business continuance environment in the cloud.
These hybrid cloud data use cases imply the use of two storage solutions. The first is an on-premises storage system (commonly a NAS filer or a scale-out file or object storage system); the second is a public cloud storage service such as Azure Blob Storage. Data typically originates in the on-premises system (source) and is moved or replicated to the cloud storage service (target). A few leading solutions provide data replication directly to public cloud targets. The use of a cloud service as the target for these use cases eliminates the need for a second physical data center with the associated capital expenditures. Additionally, cloud-bursting means that peak resources do not need to be provisioned in the enterprise data center nor do specialized services have to be developed in-house.
Security principles encompassing authentication, access control, and data encryption have been a long-term focus for on-premises storage. For years, NAS systems have provided authentication and access control including integration with enterprise security services such as Active Directory, LDAP, and Kerberos. Similar principles apply to cloud storage -- as well as intermediate data transmission requirements for hybrid cloud use cases.
The security plan and infrastructure should be comprehensive across all layers of the stack, starting with application-level policies and patching. However, because storage is the ultimate resting place of what is often critical and sensitive data, it must be a cornerstone of the overall security shield. When it comes to designing a cloud-based storage infrastructure, there are multiple threat models to be considered in addition to choosing where to store files.
Among the most important potential threats are:
- Accidental and malicious intrusion into cloud storage: unauthorized access to storage, either by an internal user who is accidentally or maliciously browsing for sensitive data or through a breach by an unknown external user.
- Network sniffing: data transmitted in the clear across the public internet or internal networks can be intercepted on the wire and read. There is also a real risk that changes to the organization’s security infrastructure will weaken it.
- Ransomware: a more nefarious step beyond malicious intrusion, in which an attacker encrypts data stored in cloud services or on-premises storage and threatens the data owner with corruption or deletion of the data unless a ransom is paid.
Well-known best practices to protect against these threats exist for on-premises infrastructure, including user authentication and access control, wire-level and at-rest encryption, locking down network ports, OS hardening, and even data versioning.
A common best practice in hybrid cloud use cases is to allow only outbound (push) data requests, from the on-premises system to the cloud, over a secure WebSocket connection or VPN. Because replication is always initiated from inside the data center, cloud services never need access into the corporate data center via holes in corporate firewalls.
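The outbound-only pattern can be illustrated with a minimal Python sketch. It assumes replication over a plain HTTPS PUT rather than the WebSocket or VPN transports mentioned above, and the function names and target URL are hypothetical. Request signing is omitted, so a real storage service would reject the request as written.

```python
import ssl
import urllib.request
from urllib.parse import urlparse


def make_tls_context() -> ssl.SSLContext:
    """Client-side TLS settings for outbound replication traffic."""
    # The default context verifies the server certificate and hostname.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def build_push_request(target_url: str, payload: bytes) -> urllib.request.Request:
    """Build an outbound PUT; reject targets that are not HTTPS."""
    if urlparse(target_url).scheme != "https":
        raise ValueError("refusing to replicate over a non-TLS URL")
    return urllib.request.Request(target_url, data=payload, method="PUT")


def push(target_url: str, payload: bytes) -> int:
    """Send the object to the cloud target and return the HTTP status.

    The connection is opened from the on-premises side, so the corporate
    firewall needs no inbound rule for the cloud service.
    """
    request = build_push_request(target_url, payload)
    with urllib.request.urlopen(request, context=make_tls_context(), timeout=30) as resp:
        return resp.status
```

A replication job would call `push()` once per object. Nothing in the sketch listens for connections, which is the point of the pattern.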
Today’s cloud storage services also provide capabilities to address these threats:
- Cloud-based authentication: cloud storage services such as Amazon S3, Azure Blob Storage, and others provide secure key-based authentication. Amazon Web Services’ method is known as Signature Version 4 (SigV4), an HMAC-based scheme. Each storage “user” needs to be known within the service and is assigned a pair of keys (an access key ID and a secret access key). Every storage request is “signed” with a key derived from the user’s secret access key; S3 uses the access key ID sent with the request to look up the same secret, recomputes the signature, and accepts the request only if the two match. This ensures that only known users can access storage.
- Cloud-based access control: extensive new identity and access management (IAM) capabilities are integrated into cloud storage. This allows very fine-grained “allow/deny” access control policies to be assigned to users, user groups, or storage containers (buckets in AWS S3 parlance) and the data within the containers (objects in S3). These policies ensure that even authorized users with access keys can access only the data they are specifically granted access to; access can be restricted to “read only.”
- Secure data transmission: wire-level encryption over HTTPS, using TLS with server certificates, is supported by all cloud storage services to prevent data from being transmitted in the clear.
- Cloud-based encryption: all leading cloud services provide options for encryption-at-rest of data. In AWS, several options of server-side encryption (SSE) can be selected, with choices for where encryption keys are generated and stored. Using cloud-native encryption (instead of transmitting pre-encrypted data from the on-premises storage system) also ensures that data can be decrypted for use by compute services (required for hybrid cloud bursting and disaster recovery).
Many object storage solutions today implement similar multitenancy security models to cloud-style services (and some have gone further to fully emulate AWS IAM) to provide a common experience for administration for on-premises deployments.
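To make the key-based authentication in the list above more concrete, here is a minimal Python sketch of the HMAC chain that AWS documents for deriving a Signature Version 4 signing key. It is a sketch under stated assumptions, not a complete client. The string to sign is a placeholder (a real request builds it from a canonical form of the method, URI, headers, and payload hash), and `verify()` is a hypothetical model of the service-side check, not AWS's actual implementation.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: chained HMACs over date, region, service."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature the client attaches to the request."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def verify(known_secrets: dict, access_key_id: str, date_stamp: str,
           region: str, service: str, string_to_sign: str,
           presented_signature: str) -> bool:
    """Hypothetical service-side check: look up the secret by access key ID,
    recompute the signature, and compare in constant time."""
    secret = known_secrets.get(access_key_id)
    if secret is None:
        return False  # unknown user
    expected = sign(derive_signing_key(secret, date_stamp, region, service),
                    string_to_sign)
    return hmac.compare_digest(expected, presented_signature)
```

The secret access key never travels with the request. Only the access key ID and the resulting signature are sent, and changing any signed part of the request invalidates the signature.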
A Final Word
In summary, cloud storage services have extensive security mechanisms encompassing much of what enterprises have used in on-premises storage systems. Data-centric hybrid cloud use cases are emerging that let enterprises gain real advantages in time-to-market, agility, and cost -- leveraging the cloud as a target for what would traditionally require a second physical data center.