Bigquery Error: Table too large for JOIN. Consider using JOIN EACH
Problem:
I'm trying to run a query in BigQuery which has joins and a Group By Clause, It throws the error:
\n\nTable too large for JOIN. Consider using JOIN EACH. For more details, \nplease see https://developers.google.com/bigquery/docs/query-reference#joins
\n\nBelow is the problematic SQL,
select\nt1.device_uuid,\nt1.session_uuid,\nt1.nth,\nt1.Diamonds_Launch,\nt2.Diamonds_Close\nfrom (\n select\n device_uuid,\n session_uuid,\n nth,\n sum(cast([project_id].[table_id].attributes.Value as integer)) as Diamonds_Launch\n from [project_id].[table_id]\n where name = 'App Launch'\n and attributes.Name = 'Inventory - Diamonds'\n group by device_uuid, session_uuid, nth\n ) as t1\njoin each (\n select\n device_uuid,\n session_uuid,\n nth,\n sum(cast([project_id].[table_id].attributes.Value as integer)) as Diamonds_Close\n from [project_id].[table_id]\n where name = 'App Close'\n and attributes.Name = 'Inventory - Diamonds'\n group by device_uuid, session_uuid, nth\n ) as t2\non t1.device_uuid = t2.device_uuid\nand t1.session_uuid = t2.session_uuid\n
Solution:
The Quey has got a GROUP BY
inside a JOIN EACH
. "GROUP BY"
hits limits with cardinality (the number of distinct values) and the final grouping is not parallelizable. This limits BigQuery's ability to do the join.
If you change the GROUP BY
to GROUP EACH BY
, this will most likely work
dan-irving
posted onEnjoy great content like this and a lot more !
Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds
Post Comment