Home » Bigquery » Tutorials » Create external table

BigQuery Create external table

External tables let BigQuery query data that is stored outside of BigQuery storage

Create external table example

    CREATE EXTERNAL TABLE dataset.CsvTable OPTIONS (
        format = 'CSV',
        uris = ['gs://bucket/path1.csv']
      );

External table with pipe delimiter

    CREATE OR REPLACE EXTERNAL TABLE dataset.CsvTable
    (
        x INT64,
        y STRING
    )
    OPTIONS (
    format = 'CSV',
    uris = ['gs://bucket/path1.csv'],
    field_delimiter = '|',
    max_bad_records = 5
    );

External table with autodetect partitions

The external path is gs://bucket/path/field_1=first/field_2=1/data.csv, the partition columns would be field_1 (STRING) and field_2 (INT64).

    CREATE EXTERNAL TABLE dataset.AutoHivePartitionedTable
    WITH PARTITION COLUMNS
    OPTIONS (
      uris=['gs://bucket/path/*'],
      format=csv,
      hive_partition_uri_prefix='gs://bucket/path'
    );

External table with declared partitions

The following example creates an externally partitioned table by explicitly specifying the partition columns. The external file path has the pattern gs://bucket/path/field_1=first/field_2=1/data.csv

    CREATE EXTERNAL TABLE dataset.CustomHivePartitionedTable
    WITH PARTITION COLUMNS (
      field_1 STRING, -- column order must match the external path
      field_2 INT64
    )
    OPTIONS (
      uris=['gs://bucket/path/*'],
      format=csv,
      hive_partition_uri_prefix='gs://bucket/path'
    );

Explanation

The External table has the following options

Options
allow_jagged_rows BOOL If true, allow rows that are missing trailing optional columns. Applies to CSV data.
allow_quoted_newlines BOOL If true, allow quoted data sections that contain newline characters in the file. Applies to CSV data.
compression STRING The compression type of the data source. Supported values include: GZIP. If not specified, the data source is uncompressed. Applies to CSV and JSON data.
description STRING A description of this table.
enable_logical_types BOOL If true, convert Avro logical types into their corresponding SQL types. For more information, see Logical types. Applies to Avro data.
encoding STRING The character encoding of the data. Supported values include: UTF8 (or UTF-8), ISO_8859_1 (or ISO-8859-1). Applies to CSV data.
expiration_timestamp TIMESTAMP The time when this table expires. If not specified, the table does not expire. Example: "2025-01-01 00:00:00 UTC".
field_delimiter STRING The separator for fields in a CSV file. Applies to CSV data.
format STRING The format of the external data. Supported values include: AVRO, CSV, DATASTORE_BACKUP, GOOGLE_SHEETS, NEWLINE_DELIMITED_JSON (or JSON), ORC, PARQUET. The value JSON is equivalent to NEWLINE_DELIMITED_JSON.
decimal_target_types ARRAY Determines how to convert a Decimal type. Equivalent to ExternalDataConfiguration.decimal_target_types Example: ["NUMERIC", "BIGNUMERIC"].
json_extension STRING For JSON data, indicates a particular JSON interchange format. If not specified, BigQuery reads the data as generic JSON records. Supported values include: GEOJSON (Preview). GeoJSON data. For more information, see Loading GeoJSON data.
hive_partition_uri_prefix STRING A common prefix for all source URIs before the partition key encoding begins. Applies only to hive-partitioned external tables. Applies to Avro, CSV, JSON, Parquet, and ORC data. Example: "gs://bucket/path".
ignore_unknown_values BOOL If true, ignore extra values that are not represented in the table schema, without returning an error. Applies to CSV and JSON data.
max_bad_records INT64 The maximum number of bad records to ignore when reading the data. Applies to: CSV, JSON, and Sheets data.
null_marker STRING The string that represents NULL values in a CSV file. Applies to CSV data.
projection_fields STRING A list of entity properties to load. Applies to Datastore data.
quote STRING The string used to quote data sections in a CSV file. If your data contains quoted newline characters, also set the allow_quoted_newlines property to true. Applies to CSV data.
require_hive_partition_filter BOOL If true, all queries over this table require a partition filter that can be used to eliminate partitions when reading data. Applies only to hive-partitioned external tables. Applies to Avro, CSV, JSON, Parquet, and ORC data.
sheet_range STRING Range of a Sheets spreadsheet to query from. Applies to Sheets data. Example: ?sheet1!A1:B20?,
skip_leading_rows INT64 The number of rows at the top of a file to skip when reading the data. Applies to CSV and Sheets data.
uris ARRAY An array of fully qualified URIs for the external data locations. Example: ["gs://bucket/path/*"].

Syntax reference

CREATE [OR REPLACE] EXTERNAL TABLE [IF NOT EXISTS] [[project_name.]dataset_name.]table_name
[(
  column_name column_schema,
  ...
)]

[WITH PARTITION COLUMNS
  [(
      partition_column_name partition_column_type,
      ...
  )]
]
OPTIONS (
  external_table_option_list,
  ...
);