-by Eyal Estrin, Cloud Architect, Inter-University Computation Center (IUCC)

Encryption is one of the most important tools at our disposal for maintaining the confidentiality of data. This post discusses encryption and related topics such as hashing and tokenization.

Encryption is the process of encoding a message or information so that only authorized parties can access it. Cryptography is about constructing and analyzing protocols that prevent third parties or the public from reading private messages. Encryption involves three elements: the source message (also known as clear-text), the encryption algorithm (see examples below), and an encryption key which, when applied to the source message, generates the ciphertext.

There are many different encryption algorithms, which generally fall into two main categories: symmetric and asymmetric.

__Symmetric Encryption__

When using symmetric encryption, the sender and the receiver share a single key, which is used both for encryption (from clear-text to ciphertext) and for decryption (from ciphertext back to the original clear-text).
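The shared-key idea can be illustrated with a minimal sketch that uses only the Python standard library. This is a toy stream cipher built for illustration, not AES and not suitable for protecting real data; the function names `keystream` and `xor_cipher` are hypothetical. Real systems should use a vetted implementation of an algorithm such as AES.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy construction: stretch key + nonce into `length` pseudo-random bytes.
    return hashlib.shake_256(key + nonce).digest(length)


def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # XOR is its own inverse, so the same function encrypts and decrypts.
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


shared_key = secrets.token_bytes(32)  # known to both sender and receiver
nonce = secrets.token_bytes(16)       # fresh for every message, sent with the ciphertext

clear_text = b"sensitive research data"
ciphertext = xor_cipher(shared_key, nonce, clear_text)  # sender side
recovered = xor_cipher(shared_key, nonce, ciphertext)   # receiver side

print(recovered == clear_text)
```

The point of the sketch is only that one key serves both directions: anyone who holds `shared_key` can read the message, and anyone who does not cannot.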

Advantages of using symmetric encryption:

- Offers confidentiality
- Offers integrity of data when combined with a message authentication code (MAC) or an authenticated mode such as AES-GCM
- Fast encryption and decryption
- Requires fewer computing resources for encryption and decryption
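Integrity with a shared key is usually achieved by attaching a message authentication code. The following is a minimal sketch using HMAC-SHA256 from the Python standard library; the helper names `make_tag` and `is_authentic` are hypothetical and chosen only for this example.

```python
import hashlib
import hmac
import secrets

shared_key = secrets.token_bytes(32)  # the same key is held by sender and receiver


def make_tag(key: bytes, message: bytes) -> bytes:
    # Compute an HMAC-SHA256 tag over the message.
    return hmac.new(key, message, hashlib.sha256).digest()


def is_authentic(key: bytes, message: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(make_tag(key, message), tag)


message = b"patient 17: blood pressure 120/80"
tag = make_tag(shared_key, message)

print(is_authentic(shared_key, message, tag))
print(is_authentic(shared_key, b"patient 17: blood pressure 180/80", tag))
```

The first check succeeds and the second fails, because any change to the message produces a different tag.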

Disadvantages of using symmetric encryption:

- Cannot be used for non-repudiation, because both parties hold the same key
- Requires a secure channel for exchanging the encryption key between the sender and the receiver
- Requires managing a large number of encryption keys (a new key for each communication channel between a sender and a receiver)
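To give a sense of the key-management burden, the following sketch counts the keys needed when every pair of parties shares its own key. It assumes one key per pair, and the function name is hypothetical.

```python
def symmetric_keys_needed(parties: int) -> int:
    # One key per unordered pair of parties: n * (n - 1) / 2.
    return parties * (parties - 1) // 2


for n in (2, 10, 100):
    print(n, symmetric_keys_needed(n))
# 2 1
# 10 45
# 100 4950
```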

Common use case for symmetric encryption:

- A teacher and a student exchange emails regarding sensitive research data (with healthcare information). The communication between them is protected by the TLS protocol (the successor to SSL), which encrypts the traffic with a symmetric session key

The following are examples of common symmetric algorithms:

- AES (Advanced Encryption Standard)

https://en.wikipedia.org/wiki/Advanced_Encryption_Standard

- 3DES (Triple DES)

https://en.wikipedia.org/wiki/Triple_DES

__Asymmetric Encryption__

With asymmetric encryption (also known as public-key encryption), each party has a pair of keys: the public key is used to encrypt messages and the matching private key is used to decrypt them.
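The split between the two keys can be seen in a small "textbook RSA" sketch. The primes below are deliberately tiny and there is no padding, so this is an illustration of the idea only and must not be used for real data. The function names are hypothetical.

```python
# Toy parameters: real RSA keys use very large primes and a padding scheme.
p, q = 61, 53
n = p * q                    # 3233, shared by both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent, 2753 (needs Python 3.8+)

public_key = (n, e)          # can be published
private_key = (n, d)         # must stay secret


def encrypt(message: int, key: tuple) -> int:
    modulus, exponent = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    modulus, exponent = key
    return pow(ciphertext, exponent, modulus)


message = 65                                  # must be smaller than n
ciphertext = encrypt(message, public_key)     # anyone can do this
recovered = decrypt(ciphertext, private_key)  # only the key owner can do this

print(ciphertext, recovered)  # 2790 65
```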

Advantages of using asymmetric encryption:

- Offers message authentication using digital signatures
- Allows non-repudiation
- Detects tampering using digital signatures
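Digital signatures reverse the roles of the two keys: the private key signs and the public key verifies. The sketch below uses unpadded textbook RSA with two Mersenne primes that were chosen only because they are easy to write down. The key is far too small for real use, and the function names are hypothetical.

```python
import hashlib

# Toy key: real signatures use much larger keys and a padding scheme such as PSS.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    # Hash the message and reduce it so it fits below the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    # Only the holder of the private exponent d can produce this value.
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    # Anyone with the public values (n, e) can check the signature.
    return pow(signature, e, n) == digest_as_int(message)


document = b"Homework 3, final version"
signature = sign(document)

print(verify(document, signature))
print(verify(b"Homework 3, edited version", signature))
```

Verification succeeds for the original document and fails for the edited one. Because only the signer holds the private key, a valid signature also supports non-repudiation.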

Disadvantages of using asymmetric encryption:

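- Encryption and decryption are slow compared with symmetric algorithms
- Public keys are not authenticated by themselves; an additional mechanism, such as certificates issued by a public key infrastructure (PKI), is needed to confirm who a public key belongs to
- Losing the private key prevents messages encrypted with the matching public key from being decrypted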

Common use case for asymmetric encryption:

- A student submits homework through a university-secured web site. The student encrypts the document with the teacher’s public key, and the teacher decrypts it with his or her private key. If the student also signs the document with the student’s own private key, the teacher can use the student’s public key to validate that the submitted document was not altered

The following are examples of common asymmetric algorithms:

- RSA (Rivest, Shamir, Adleman)

https://en.wikipedia.org/wiki/RSA_(cryptosystem)

- ECC (Elliptic-curve cryptography)

https://en.wikipedia.org/wiki/Elliptic-curve_cryptography

__Cryptographic Hash Function__

This is a one-way mathematical function which takes an input of any size and produces a bit string of a fixed size. Cryptographic hash functions are commonly used to store passwords in an irreversible form and to identify files by their checksums.
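The fixed-size property and the checksum use can be sketched with Python's `hashlib` module. The helper `file_checksum` and its `chunk_size` parameter are hypothetical names for this example.

```python
import hashlib

short_digest = hashlib.sha256(b"abc").hexdigest()
long_digest = hashlib.sha256(b"abc" * 100_000).hexdigest()

# Both digests are 256 bits (64 hex characters), regardless of input size.
print(len(short_digest), len(long_digest))  # 64 64

# SHA-3 is a different algorithm, so the same input gives a different digest.
sha3_digest = hashlib.sha3_256(b"abc").hexdigest()
print(sha3_digest != short_digest)  # True


def file_checksum(path: str, chunk_size: int = 65536) -> str:
    # Hash a file in chunks so that large files do not need to fit in memory.
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Two files with the same checksum can be treated as identical for practical purposes, which is why download pages often publish a SHA-256 value next to each file.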

Common use case of hash:

- A genomic database stores a hash of each genome sequence, which makes it easier to search for a specific genome sequence
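One way to read this use case is an index keyed by the digest of each sequence. The sketch below assumes exact-match lookups only; the sample names and sequences are made up for illustration.

```python
import hashlib


def sequence_id(sequence: str) -> str:
    # Normalize case and whitespace so equal sequences always hash the same way.
    normalized = sequence.strip().upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


# Hypothetical records: digest of the sequence -> sample name.
index = {}
for name, sequence in [("sample-A", "ACGTACGTGA"), ("sample-B", "TTGACCAGTA")]:
    index[sequence_id(sequence)] = name

print(index.get(sequence_id("acgtacgtga")))  # sample-A
print(index.get(sequence_id("ACGTACGTGG")))  # None
```

The lookup compares short fixed-size digests instead of full sequences, which can be very long.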

The following are examples of common hash functions:

- SHA-2 (Secure Hash Algorithm 2)

https://en.wikipedia.org/wiki/SHA-2

- SHA-3 (Secure Hash Algorithm 3)

https://en.wikipedia.org/wiki/SHA-3
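For the password-storage use mentioned earlier in this section, a plain hash is normally combined with a random salt and many iterations. The sketch below uses PBKDF2 with SHA-256 (a member of the SHA-2 family) from the standard library. The iteration count is an illustrative assumption and should be tuned to current guidance; the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed value for illustration only


def hash_password(password: str, salt: bytes = None) -> tuple:
    # A random salt makes equal passwords produce different stored values.
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    # Re-derive with the stored salt and compare in constant time.
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")

print(check_password("correct horse battery staple", salt, stored))
print(check_password("wrong guess", salt, stored))
```

Only the salt and the derived value are stored; the password itself is never written to the database.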

__Tokenization__

Tokenization is an alternative to encryption for protecting the confidentiality of data. Tokenization substitutes sensitive information with a non-sensitive placeholder, called a token. It is commonly used when storing credit card information or PII (such as social security numbers) in a database.

Common use case for tokenization:

- A student logs on to the university website and uses his credit card number to pay for his studies in the coming semester. The credit card number is stored in the university database in tokenized form in order to avoid a breach of sensitive data. The tokenized number is then checked with the credit card company to make sure the card details are valid and to complete the payment process.
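The substitution described above can be sketched as a small in-memory token vault. This is an assumption about how such a service might look, not a description of any particular product: the `TokenVault` class, its methods, and the `tok_` prefix are hypothetical, and a real vault would be a separate, tightly secured system.

```python
import secrets


class TokenVault:
    """Toy vault mapping random tokens to the sensitive values they replace."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)  # random, unrelated to the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only the vault can map a token back to the original value.
        return self._token_to_value[token]


vault = TokenVault()
card_number = "4111111111111111"  # a widely used test number, not a real card
token = vault.tokenize(card_number)

print(token != card_number)
print(vault.detokenize(token) == card_number)
```

Unlike ciphertext, the token has no mathematical relationship to the card number, so a database that holds only tokens reveals nothing if it is breached.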

References:

https://en.wikipedia.org/wiki/Encryption

https://en.wikipedia.org/wiki/Cryptography

https://en.wikipedia.org/wiki/Cryptographic_hash_function

https://en.wikipedia.org/wiki/Tokenization_(data_security)