How to fetch Random Samples in Google BigQuery

RAND() function can be used to generate a (seemingly) random number. You can use it to fetch a sample of 10 words from the Shakespeare dataset using:

\n\n
#standardSQL\nSELECT word\nFROM `publicdata.samples.shakespeare`\nWHERE RAND() < 10/164656\n
\n\n

or even better:

\n\n
#standardSQL\nSELECT word\nFROM `publicdata.samples.shakespeare`\nWHERE RAND() < 10/(SELECT COUNT(*) FROM `publicdata.samples.shakespeare`)\n

(where 10 is the approximate number of results I want to get, and 164656 the number of rows that table has)

Atori

posted on

Enjoy great content like this and a lot more !

Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds