Bigquery Error: Table too large for JOIN. Consider using JOIN EACH
Problem:
I'm trying to run a query in BigQuery which has joins and a Group By Clause, It throws the error:
Table too large for JOIN. Consider using JOIN EACH. For more details,
please see https://developers.google.com/bigquery/docs/query-reference#joins
Below is the problematic SQL,
select
t1.device_uuid,
t1.session_uuid,
t1.nth,
t1.Diamonds_Launch,
t2.Diamonds_Close
from (
select
device_uuid,
session_uuid,
nth,
sum(cast([project_id].[table_id].attributes.Value as integer)) as Diamonds_Launch
from [project_id].[table_id]
where name = 'App Launch'
and attributes.Name = 'Inventory - Diamonds'
group by device_uuid, session_uuid, nth
) as t1
join each (
select
device_uuid,
session_uuid,
nth,
sum(cast([project_id].[table_id].attributes.Value as integer)) as Diamonds_Close
from [project_id].[table_id]
where name = 'App Close'
and attributes.Name = 'Inventory - Diamonds'
group by device_uuid, session_uuid, nth
) as t2
on t1.device_uuid = t2.device_uuid
and t1.session_uuid = t2.session_uuid
Solution:
The Quey has got a GROUP BY
inside a JOIN EACH
. "GROUP BY"
hits limits with cardinality (the number of distinct values) and the final grouping is not parallelizable. This limits BigQuery's ability to do the join.
If you change the GROUP BY
to GROUP EACH BY
, this will most likely work
dan-irving
posted on 10 Sep 18Enjoy great content like this and a lot more !
Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds
Post Comment