Solved: Bigquery "Invalid schema update. Field version has changed type" error when loading data using autodetect

To troubleshoot this error, first we need to understand how autodetect works

How schema autodetect works in Bigquery:

Autodetect will scan up to 100 rows to detect the data type. If for all 100 rows, the data is like "1.12", then very likely this field is a float value. If one of the row has value "1.12.0", then BigQuery will detect the type is string. So, if the initial file and the subsequent file is not in sync, you may end up with this error

Also if Bigquery finds any unsupported characters in the column name, it will replace it with "_" Underscores.

For Example:

Let's say I have a table with one single field named "version", which is a string. 

  • When you try to load data into the table using autodetect with values like "1.1" or "1", the autodetect feature infers these values as float or integer type respectively
  • When you try to load "1.11.0", Bigquery assigns a STRING datatype to the column. In the day 2 load if you receive a new file with value "1.10", Bigquery autodetect will now declare this column as FLOAT and while inserting into the table it will fail. Because in Day-1 the column was created as STRING and in Day-2 we are trying to insert a FLOAT value (Error: Invalid schema update. Field version has changed type from STRING to FLOAT)

Solution:

To solve this error, Do not use auto-detect, always pre-create your tables wherever you may face this inconsistency

Mike-Barn

posted on 09 Mar 19

Enjoy great content like this and a lot more !

Signup for a free account to write a post / comment / upvote posts. Its simple and takes less than 5 seconds