-by Eyal Estrin, Cloud Architect, Inter-University Computation Center (IUCC)
Additional fundamental Cloud Security concepts can be found here.

The goal of this post is to present a common case study for building a research environment in Google Cloud Platform (GCP).

Building an environment in the cloud involves several topics we need to take into consideration, such as how to access resources in the cloud, where and how to store data, and how to protect the infrastructure.

Let’s consider the following architecture:

  • Researchers will connect to the cloud environment remotely over the internet and connect to a Linux machine with data analytics tools
  • Original data sets will be stored using file storage
  • Output data will be processed in a MySQL database
  • Due to data sensitivity, data must be protected at all times

In the following sections, we will break down the research team's requirements into best practices using built-in GCP services:

Infrastructure

  • For the base OS image, we will use the most up-to-date Deep Learning VM Image, which uses Debian Linux and includes the latest security patches
  • After deploying the VM, we will install the latest build of our analytics tools and development interpreters (such as Python)
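
As a rough illustration of the deployment step above, the following Python sketch assembles a `gcloud compute instances create` command for a VM based on a Deep Learning VM image. The instance name, zone, subnet name, machine type and image family are all hypothetical placeholders (none of them appear in this post); the image family in particular should be taken from the current Deep Learning VM documentation. The sketch only builds and prints the command, it does not run it.

```python
import shlex


def build_create_vm_command(name, zone, image_family, subnet,
                            machine_type="n1-standard-4"):
    """Return the gcloud argument list for creating the research VM.

    All argument values are placeholders chosen for illustration.
    """
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        # Deep Learning VM images are published in this image project.
        "--image-project=deeplearning-platform-release",
        # Assumption: replace with an image family listed in the docs.
        f"--image-family={image_family}",
        f"--subnet={subnet}",
    ]


if __name__ == "__main__":
    command = build_create_vm_command(
        name="research-vm",              # hypothetical instance name
        zone="europe-west1-b",           # hypothetical zone
        image_family="IMAGE_FAMILY_FROM_DOCS",
        subnet="dmz-subnet",             # hypothetical subnet name
    )
    # Print a copy-pasteable shell command for review before running it.
    print(shlex.join(command))
```

Using an image family rather than a specific image name means the VM is created from the newest image in that family, which matches the goal of starting from the most up-to-date image.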

Network connectivity

  • Secure remote access to the cloud environment will be provided by deploying OpenVPN from the Google Cloud Marketplace and connecting with OpenVPN clients
  • All resources will be located in a single Google Cloud VPC, but the Linux VM and the MySQL database will be located in separate subnets
  • The Linux VM will be located in a DMZ subnet, and access to this subnet will be protected using GCP firewall rules that allow only VPN-authenticated clients on TCP port 22 (SSH)
  • The database will be located in a DB subnet, and access to this subnet will be protected using GCP firewall rules that allow access to the Cloud SQL port (TCP 3306 by default for MySQL) from the DMZ subnet only
  • Further explanation about GCP firewall rules can be found here: https://cloud.google.com/vpc/docs/using-firewalls#creating_firewall_rules
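
The access policy described in the bullets above can be modeled in a few lines of plain Python. This is only a sketch for reasoning about which connections should be permitted; it does not call any GCP API. The CIDR ranges and rule names are hypothetical, and port 3306 assumes the MySQL default. As with VPC ingress rules, anything not explicitly allowed is treated as denied.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical address ranges; substitute the CIDRs of your own VPC.
VPN_CLIENTS = "10.8.0.0/24"   # addresses handed out to OpenVPN clients
DMZ_SUBNET = "10.10.1.0/24"   # subnet holding the Linux VM
DB_SUBNET = "10.10.2.0/24"    # subnet holding the MySQL database


@dataclass(frozen=True)
class IngressRule:
    name: str
    source_range: str
    dest_range: str
    protocol: str
    port: int


RULES = [
    # SSH to the DMZ subnet, only from VPN-authenticated clients.
    IngressRule("allow-ssh-from-vpn", VPN_CLIENTS, DMZ_SUBNET, "tcp", 22),
    # MySQL to the DB subnet, only from the DMZ subnet.
    IngressRule("allow-mysql-from-dmz", DMZ_SUBNET, DB_SUBNET, "tcp", 3306),
]


def is_allowed(src_ip, dst_ip, protocol, port, rules=RULES):
    """Return True if any allow rule matches; everything else is denied."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (
            src in ipaddress.ip_network(rule.source_range)
            and dst in ipaddress.ip_network(rule.dest_range)
            and protocol == rule.protocol
            and port == rule.port
        ):
            return True
    return False


if __name__ == "__main__":
    # A VPN client reaching the Linux VM over SSH.
    print(is_allowed("10.8.0.5", "10.10.1.10", "tcp", 22))
    # The same VPN client trying to reach the database directly.
    print(is_allowed("10.8.0.5", "10.10.2.10", "tcp", 3306))
```

Writing the policy down this way makes the intended paths explicit: researchers reach the Linux VM only through the VPN, and the database is reachable only from the Linux VM's subnet.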

Database

Storage

Authentication

Auditing

Summary

In this post, I've explained how to use GCP services in order to build and maintain a secure research environment, while keeping sensitive data protected and meeting all the research requirements specified at the beginning of the post.