How to find DISTINCT values in a ARRAY in Bigquery
Often, we have array datatypes and want to remove the duplicates (often called as dedup) and find out the unique values in an array in Bigquery, like shown in the example below:
Solution:
Much cleaner way is to use SQL UDF to encapsulate dedup logic as in below example and reuse it whenever needed:
#Declare the function once
#standardSQL
CREATE TEMP FUNCTION dedup(val ANY TYPE) AS ((
SELECT ARRAY_AGG(t)
FROM (SELECT DISTINCT * FROM UNNEST(val) v) t
));
And call the function for all the array columns that you want to perform dedup:
SELECT * REPLACE(
dedup(country) AS country,
dedup(product) AS product
)
FROM `project.dataset.table`
nVector
posted on 05 Jun 20Enjoy great content like this and a lot more !
Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds
Post Comment