How to partition by month/quarter/year to get over the partition limit in Bigquery ?

Bigquery allows you to create partitioned tables. And partitioned tables can have a maximum of 4000 partitions per table. If you exceed that limit bigquery throws the below error message:

Too many partitions produced by query, allowed 4000, query produces at \nleast 12384 partitions

Solution

Method #1:

Try partitioning based on month / quarter / year. And use cluster keys to prune the data efficiently.

CREATE TABLE mydataset.newtable 
(
transaction_id INT64, 
transaction_date DATE\n)
PARTITION BY DATE_TRUNC(transaction_date, MONTH);
\n\n

In case of timestamp column:

CREATE TABLE mydataset.newtable 
(
transaction_id INT64,
transaction_ts TIMESTAMP
)
PARTITION BY TIMESTAMP_TRUNC(transaction_ts, MONTH);

Method #2: You can also consider partitioning using a manufactured column

\n\n
CREATE TABLE `dataset.tablename`\nPARTITION BY FlightDate_year\nCLUSTER BY Origin, Dest \nAS\nSELECT *, DATE_TRUNC(FlightDate, YEAR) FlightDate_year\nFROM `flights.raw_load_fixed`

DataFreak

posted on

Enjoy great content like this and a lot more !

Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds