BigQuery Create external table
External tables let BigQuery query data that is stored outside of BigQuery storage
Create external table example
CREATE EXTERNAL TABLE dataset.CsvTable OPTIONS ( format = 'CSV', uris = ['gs://bucket/path1.csv'] );
External table with pipe delimiter
CREATE OR REPLACE EXTERNAL TABLE dataset.CsvTable ( x INT64, y STRING ) OPTIONS ( format = 'CSV', uris = ['gs://bucket/path1.csv'], field_delimiter = '|', max_bad_records = 5 );
External table with autodetect partitions
The external path is gs://bucket/path/field_1=first/field_2=1/data.csv, the partition columns would be field_1 (STRING) and field_2 (INT64).
CREATE EXTERNAL TABLE dataset.AutoHivePartitionedTable WITH PARTITION COLUMNS OPTIONS ( uris=['gs://bucket/path/*'], format=csv, hive_partition_uri_prefix='gs://bucket/path' );
External table with declared partitions
The following example creates an externally partitioned table by explicitly specifying the partition columns. The external file path has the pattern gs://bucket/path/field_1=first/field_2=1/data.csv
CREATE EXTERNAL TABLE dataset.CustomHivePartitionedTable WITH PARTITION COLUMNS ( field_1 STRING, -- column order must match the external path field_2 INT64 ) OPTIONS ( uris=['gs://bucket/path/*'], format=csv, hive_partition_uri_prefix='gs://bucket/path' );
Explanation
The External table has the following options
Options | |
---|---|
allow_jagged_rows | BOOL If true, allow rows that are missing trailing optional columns. Applies to CSV data. |
allow_quoted_newlines | BOOL If true, allow quoted data sections that contain newline characters in the file. Applies to CSV data. |
compression | STRING The compression type of the data source. Supported values include: GZIP. If not specified, the data source is uncompressed. Applies to CSV and JSON data. |
description | STRING A description of this table. |
enable_logical_types | BOOL If true, convert Avro logical types into their corresponding SQL types. For more information, see Logical types. Applies to Avro data. |
encoding | STRING The character encoding of the data. Supported values include: UTF8 (or UTF-8), ISO_8859_1 (or ISO-8859-1). Applies to CSV data. |
expiration_timestamp | TIMESTAMP The time when this table expires. If not specified, the table does not expire. Example: "2025-01-01 00:00:00 UTC". |
field_delimiter | STRING The separator for fields in a CSV file. Applies to CSV data. |
format | STRING The format of the external data. Supported values include: AVRO, CSV, DATASTORE_BACKUP, GOOGLE_SHEETS, NEWLINE_DELIMITED_JSON (or JSON), ORC, PARQUET. The value JSON is equivalent to NEWLINE_DELIMITED_JSON. |
decimal_target_types |
ARRAY |
json_extension | STRING For JSON data, indicates a particular JSON interchange format. If not specified, BigQuery reads the data as generic JSON records. Supported values include: GEOJSON (Preview). GeoJSON data. For more information, see Loading GeoJSON data. |
hive_partition_uri_prefix | STRING A common prefix for all source URIs before the partition key encoding begins. Applies only to hive-partitioned external tables. Applies to Avro, CSV, JSON, Parquet, and ORC data. Example: "gs://bucket/path". |
ignore_unknown_values | BOOL If true, ignore extra values that are not represented in the table schema, without returning an error. Applies to CSV and JSON data. |
max_bad_records | INT64 The maximum number of bad records to ignore when reading the data. Applies to: CSV, JSON, and Sheets data. |
null_marker | STRING The string that represents NULL values in a CSV file. Applies to CSV data. |
projection_fields | STRING A list of entity properties to load. Applies to Datastore data. |
quote | STRING The string used to quote data sections in a CSV file. If your data contains quoted newline characters, also set the allow_quoted_newlines property to true. Applies to CSV data. |
require_hive_partition_filter | BOOL If true, all queries over this table require a partition filter that can be used to eliminate partitions when reading data. Applies only to hive-partitioned external tables. Applies to Avro, CSV, JSON, Parquet, and ORC data. |
sheet_range | STRING Range of a Sheets spreadsheet to query from. Applies to Sheets data. Example: ?sheet1!A1:B20?, |
skip_leading_rows | INT64 The number of rows at the top of a file to skip when reading the data. Applies to CSV and Sheets data. |
uris |
ARRAY |
Syntax reference
CREATE [OR REPLACE] EXTERNAL TABLE [IF NOT EXISTS] [[project_name.]dataset_name.]table_name [( column_name column_schema, ... )] [WITH PARTITION COLUMNS [( partition_column_name partition_column_type, ... )] ] OPTIONS ( external_table_option_list, ... );