Google BigQuery - Data Encryption and Security 101

In the cloud, data encryption is no longer a luxury but a necessity. BigQuery encrypts all your data at rest and in transit. This tutorial takes a deep dive into BigQuery's encryption features.

Data Security Features in BigQuery

  • Automatic data encryption by BigQuery using Google-managed keys.
  • All communication and data transfer between clients and the server is protected through TLS.
  • You can choose the geographic location where your data is stored by setting the dataset's region or multi-region.
  • Support for VPC Service Controls, which restrict access to BigQuery from outside a defined service perimeter.
  • Granular permission control using IAM (Identity and Access Management).

What is End-to-End Encryption

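End-to-end encryption (E2EE) means that data is encrypted from the moment it leaves the user until it is loaded, and vice versa. In BigQuery, data is encrypted in transit between the client and Google and encrypted at rest once it is stored, so third parties on the network path, such as an ISP, cannot see the data in the clear. Note that BigQuery's runtime components must decrypt the data in order to process queries; if the service itself must never see plaintext, use the client-side encryption described in section 3.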

1. Data encryption at rest (i.e. on-disk encryption)

By default, all data is encrypted with the AES-256 algorithm before it is stored on disk. This covers both the data and the metadata of each BigQuery object. You can let Google manage your KEKs (key encryption keys, which we will cover in the next tutorial) or supply your own KEKs through Cloud KMS (customer-managed encryption keys, or CMEK).

Even transient data (e.g. files that you stage temporarily in GCS buckets before loading them into BigQuery) is encrypted using the GCS encryption keys.

2. Data encryption in transit (i.e. while querying / loading from on-premises)

Any data that you query from BigQuery is sent over HTTPS and encrypted with Transport Layer Security (TLS) before it is delivered to your machine. The same protection applies to data you write, so your data stays encrypted as it travels over the Internet during both read and write operations.
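To make the client side of this concrete, here is a minimal sketch using only Python's standard `ssl` module. It is not BigQuery-specific: Google's client libraries set up HTTPS for you, so the helper name `make_strict_tls_context` is purely illustrative. It shows what a client that verifies the server certificate and refuses old protocol versions looks like.

```python
import ssl


def make_strict_tls_context() -> ssl.SSLContext:
    """Build a client TLS context (illustrative helper, not part of any Google library)."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_strict_tls_context()
print(ctx.verify_mode, ctx.check_hostname, ctx.minimum_version)
```

A context like this could be passed to any standard-library HTTPS client; a connection to a server with an invalid certificate would then fail instead of silently falling back to an unverified channel.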

3. Client-Side encryption

In addition to the standard encryption provided by Google, you can add another layer called client-side encryption, where you explicitly encrypt the data on-premises with your own keys before sending it to BigQuery. With this method you also take on the burden of managing encryption and decryption yourself, since Google is not involved in this custom encryption.

Global BigQuery Dataset encryption or individual table encryption

You can encrypt an entire dataset with the same encryption key or use different keys for different tables. When creating a dataset, you can specify a default encryption key; tables created in that dataset use it unless you specify a different key for an individual table.
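A dataset-level default key and a table-level override can be sketched as request bodies for the BigQuery REST API. This is a minimal sketch, assuming the v2 API field names `defaultEncryptionConfiguration` (on a dataset) and `encryptionConfiguration` (on a table). The project, key ring, key, dataset, and table names are hypothetical placeholders, and the code only builds the JSON payloads without sending them.

```python
import json


def kms_key_name(project: str, location: str, key_ring: str, key: str) -> str:
    """Return the full Cloud KMS resource name for a key."""
    return (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def dataset_body(project: str, dataset_id: str, location: str, kms_key: str) -> dict:
    """Dataset payload whose new tables default to the given KMS key."""
    return {
        "datasetReference": {"projectId": project, "datasetId": dataset_id},
        "location": location,
        "defaultEncryptionConfiguration": {"kmsKeyName": kms_key},
    }


def table_body(project: str, dataset_id: str, table_id: str, kms_key: str) -> dict:
    """Table payload that overrides the dataset default with its own key."""
    return {
        "tableReference": {
            "projectId": project,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        "encryptionConfiguration": {"kmsKeyName": kms_key},
    }


# All names below are hypothetical placeholders.
default_key = kms_key_name("my-project", "us", "my-keyring", "dataset-key")
table_key = kms_key_name("my-project", "us", "my-keyring", "payments-key")

dataset = dataset_body("my-project", "sales", "US", default_key)
table = table_body("my-project", "sales", "payments", table_key)

print(json.dumps(dataset, indent=2))
print(json.dumps(table, indent=2))
```

For a setup like this to work, the Cloud KMS key has to live in a location that matches the dataset's location, and the BigQuery service account needs permission to encrypt and decrypt with that key.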

NOTE: Encryption is enabled by default, and there is no additional cost for using the default encryption. Customer-managed keys are billed separately through Cloud KMS.