Difference between Numeric vs Float datatype in Google Bigquery
Google bigquery offers three datatypes to store numbers / amounts - INT64, FLOAT64 and NUMERIC. As the name suggests, INT64 is for storing whole numbers. Whereas FLOAT and NUMERIC are used to stored decimal values.
What is the difference between NUMERIC and FLOAT64 ?
At first look, they both look the same. In fact, Float even takes less space to store the same values. The real difference is when doing arithmetic operations (+,-,*) or while doing comparison operations (>,<,+). Since, Float is an approximate datatype, there will be a leak and you may not find the exact value.
Why you should not use FLOAT64:
Approximate numeric data types (like FLOAT64) do not store the exact values specified for many numbers; they store an extremely close approximation of the value. For many applications, the tiny difference between the specified value and the stored approximation is not noticeable. At times, though, the difference becomes noticeable. Because of the approximate nature of the float and real data types, do not use these data types when exact numeric behavior is required, such as in financial applications, in operations involving rounding, or in equality checks. Instead, use the NUMERIC data type.
Avoid using float in WHERE clause search conditions, especially the = and <> operators. It is best to limit float and real columns to > or < comparisons.
The IEEE 754 specification provides four rounding modes: round to nearest, round up, round down, and round to zero, and it uses one of these modes to round your data.
Here's a proof:
SELECT
4.35 * 100 a_float
,CAST(4.35 AS NUMERIC) * 100 a_numeric
Results:
Row | a_float | a_numeric |
1 | 434.9999999999999994 | 435 |
In Summary, use float datatype for storing non critical information like quantity, For all critical fields, where precision matters, use NUMERIC Datatype.
Beware, Numeric datatype has its own share of pitfalls (listed below), but regardless NUMERIC is a better alternative to FLOAT64
- NUMERIC auto-detects the scale, you cannot define how many digits can appear after the dot. Its dependent on the input data. Not a big deal, but you may not have those pretty looking reports
- NUMERIC is not supported by Tableau, so, any columns defined as NUMERIC may not show up in your Tableau reports
As part of the engineering team, you have the responsibility to carefully understand these differences and choose a optimal datatypes while creating data models for your cloud data warehouse
dan-irving
posted on 11 Dec 19Enjoy great content like this and a lot more !
Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds
Post Comment