Fundamental Cloud Security Part 13 – Case Study: Building a Secure Research Environment in AWS

The goal of this post is to present a common case study for building a research environment in AWS.

Building an environment in the cloud involves several topics we need to take into consideration, such as how to access resources in the cloud, where and how to store data, and how to protect the infrastructure.

Let’s consider the following architecture:

Researchers will connect to the cloud environment remotely over the internet and connect to a Linux machine with data analytics tools.
Original data sets will be stored using object storage.
Output data will be processed in a MySQL database.
Due to data sensitivity, data must be protected at all times.

In the following sections, we will break down the research team's requirements into best practices using built-in AWS services:

Infrastructure

For the base OS image, we will use the most up-to-date Linux AMI, which contains the latest security patches.
After deploying the VM, we will install the latest build of our analytics tools and development interpreters (such as Python).
Once the image is fully installed, we will deploy Amazon Inspector in order to make sure the image is assessed for security vulnerabilities on a regular basis.

An example of a possible solution can be seen here: https://aws.amazon.com/blogs/security/how-to-set-up-continuous-golden-ami-vulnerability-assessments-with-amazon-inspector/

In order to deploy security patches on the Linux machine, we will use AWS Systems Manager Patch Manager, which relies on the AWS Systems Manager Agent (SSM Agent) running on the instance.

Network connectivity

Remote access to the cloud environment will be provided using AWS Client VPN.
All resources will be located in a single Amazon VPC, but the Linux VM and the MySQL database will be located in separate subnets.
The Linux VM will be located in a DMZ subnet, and access to it will be protected using a security group that allows only VPN-authenticated clients on TCP port 22 (SSH).
The database will be located in a DB subnet, and access to it will be protected using a DB security group that allows the MySQL port (TCP 3306 by default) from the DMZ subnet only.
Further explanation about security groups can be found here: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.RDSSecurityGroups.html
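
The two security group rules described above can be sketched as plain Python data. This is a minimal illustration under stated assumptions, not a definitive configuration: the CIDR ranges and the security group ID are hypothetical, and the SSH rule assumes the Client VPN endpoint has its own security group that can be referenced as the traffic source.

```python
import ipaddress

# Hypothetical address plan and IDs, chosen for illustration only.
VPC_CIDR = "10.0.0.0/16"
DMZ_SUBNET_CIDR = "10.0.1.0/24"
DB_SUBNET_CIDR = "10.0.2.0/24"
VPN_ENDPOINT_SG_ID = "sg-0123456789abcdef0"


def tcp_rule_from_cidr(port, cidr, description):
    """Build one ingress rule allowing a single TCP port from a CIDR block."""
    ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr, "Description": description}],
    }


def tcp_rule_from_group(port, group_id, description):
    """Build one ingress rule allowing a single TCP port from another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "UserIdGroupPairs": [{"GroupId": group_id, "Description": description}],
    }


def subnets_fit_vpc(vpc_cidr, *subnet_cidrs):
    """Check that every subnet lies inside the VPC and that no two subnets overlap."""
    vpc = ipaddress.ip_network(vpc_cidr)
    nets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    inside = all(net.subnet_of(vpc) for net in nets)
    disjoint = all(
        not a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:]
    )
    return inside and disjoint


# Linux VM: SSH only, and only from the (assumed) VPN endpoint security group.
linux_vm_ingress = [tcp_rule_from_group(22, VPN_ENDPOINT_SG_ID, "SSH from VPN clients")]

# Database: MySQL only, and only from the DMZ subnet.
db_ingress = [tcp_rule_from_cidr(3306, DMZ_SUBNET_CIDR, "MySQL from DMZ subnet")]
```

Each list follows the `IpPermissions` shape accepted by the EC2 `AuthorizeSecurityGroupIngress` API, so it could be passed to an SDK call as-is. Keeping each rule to one port and one source makes it easy to review that nothing else is reachable.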

Database

The MySQL database will be deployed as a managed service using Amazon RDS.
Access privileges to the MySQL database will be restricted using AWS IAM roles.
The traffic between the Linux machine and the MySQL database will be encrypted using TLS, as explained here: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html
Data inside the MySQL database will be encrypted at rest, and the encryption keys will be stored in AWS KMS, as explained here: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.Encryption.html
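
On the client side, the encrypted connection between the Linux machine and the database depends on the client actually verifying the server certificate. The sketch below uses only the Python standard library to build such a TLS context. The function name and the optional CA bundle path are assumptions for illustration, and how the context is handed to a MySQL driver depends on the driver in use.

```python
import ssl


def build_db_tls_context(ca_bundle_path=None):
    """Return an SSLContext that verifies the server certificate and hostname.

    ca_bundle_path is an optional path to a CA bundle file (for example, one
    downloaded from the database provider); when omitted, the system trust
    store is used.
    """
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context(cafile=ca_bundle_path)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_db_tls_context()
```

Requiring verification, rather than only enabling encryption, is the design choice that matters here: an encrypted connection to an unverified server does not protect sensitive data from an impersonated endpoint.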

Storage

Data will be stored in Amazon S3, in a private bucket, as explained here: https://aws.amazon.com/blogs/security/how-to-use-bucket-policies-and-apply-defense-in-depth-to-help-secure-your-amazon-s3-data/
Access to the S3 bucket will be restricted using an IAM policy, as explained here: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-policies-s3.html
Data inside the S3 bucket will be encrypted at rest, as explained here: https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html
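
As a rough illustration of the access restrictions above, the following sketch builds two policy documents as Python dictionaries: a read-only IAM policy for researchers, and a bucket policy that rejects any request not sent over TLS. The bucket name and function names are hypothetical, and a real environment would likely need further statements (for example, write access for a data-ingestion role).

```python
import json

BUCKET_NAME = "example-research-datasets"  # hypothetical bucket name


def read_only_iam_policy(bucket):
    """IAM identity policy: list the bucket and read its objects, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


def tls_only_bucket_policy(bucket):
    """Bucket policy: deny every request that does not use HTTPS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


iam_policy_json = json.dumps(read_only_iam_policy(BUCKET_NAME), indent=2)
bucket_policy_json = json.dumps(tls_only_bucket_policy(BUCKET_NAME), indent=2)
```

The two documents work at different layers: the IAM policy grants the minimum a researcher needs, while the explicit deny in the bucket policy applies to every principal regardless of what their IAM policies allow.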

Authentication

SSH logins to the Linux VM will be authenticated using AWS Directory Service, as explained here: https://aws.amazon.com/answers/security/aws-controlling-os-access-to-ec2/

Auditing

Access to all resources will be audited for further review using AWS CloudTrail, as explained here: https://docs.aws.amazon.com/awscloudtrail/latest/userguide/best-practices-security.html
Suspicious activity will trigger alarms using Amazon CloudWatch, as explained here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
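
To make "suspicious activity" more concrete, the sketch below counts denied API calls per principal in a CloudTrail log file and flags principals that reach a threshold. It is one possible heuristic, not the method the linked guides prescribe: the error-code matching rule, the threshold, and the sample records are all assumptions, and the sample data is synthetic.

```python
import json
from collections import Counter

# Synthetic sample in the CloudTrail log file layout (top-level "Records" list).
SAMPLE_LOG = json.dumps({
    "Records": [
        {"eventName": "GetObject", "errorCode": "AccessDenied",
         "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
        {"eventName": "RunInstances", "errorCode": "Client.UnauthorizedOperation",
         "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
        {"eventName": "ListBucket",
         "userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"}},
    ]
})


def is_denied(record):
    """Treat AccessDenied* and *UnauthorizedOperation error codes as denials."""
    code = record.get("errorCode", "")
    return code.startswith("AccessDenied") or code.endswith("UnauthorizedOperation")


def denied_calls_by_principal(log_text):
    """Count denied calls per caller ARN in one CloudTrail log file."""
    records = json.loads(log_text).get("Records", [])
    counts = Counter()
    for record in records:
        if is_denied(record):
            arn = record.get("userIdentity", {}).get("arn", "unknown")
            counts[arn] += 1
    return counts


def principals_over_threshold(counts, threshold):
    """Return the ARNs with at least `threshold` denied calls, sorted by name."""
    return sorted(arn for arn, n in counts.items() if n >= threshold)


counts = denied_calls_by_principal(SAMPLE_LOG)
flagged = principals_over_threshold(counts, threshold=2)
```

In practice the same condition would more likely be expressed as a CloudWatch metric filter with an alarm on it; the Python version is only meant to show what such a filter is looking for.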

Summary

In this post, I’ve explained how to use AWS services in order to build and maintain a secure research environment, while keeping sensitive data protected and meeting all the research requirements specified at the beginning of the post.

About the author

Eyal Estrin, cloud architect.