Connection from BigQuery to Databricks populating dictionary keys as "v"

KristiLogos
Contributor

I was able to connect our Bigquery account to our Databricks catalog. However, all the keys in the nested dictionary columsn populate as 'v'. For example:

{"v":[{"v":{"f":[{"v":"engagement_time_msec"},{"v":{"f":[{"v":null},{"v":"2"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"engaged_session_event"},{"v":{"f":[{"v":null},{"v":"1"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"page_location"},{"v":{"f":[{"v":"https://www.logos.com/cart"},{"v":null},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"ignore_referrer"},{"v":{"f":[{"v":"true"},{"v":null},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"batch_page_id"},{"v":{"f":[{"v":null},{"v":"1735751131326"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"batch_ordering_id"},{"v":{"f":[{"v":null},{"v":"3"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"currency"},{"v":{"f":[{"v":"USD"},{"v":null},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"page_referrer"},{"v":{"f":[{"v":"https://www/subscriber"},{"v":null},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"page_title"},{"v":{"f":[{"v":"Shopping Cart | Logos Bible Software"},{"v":null},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"value"},{"v":{"f":[{"v":null},{"v":null},{"v":null},{"v":"74.87"}]}}]}},{"v":{"f":[{"v":"ga_session_id"},{"v":{"f":[{"v":null},{"v":"1735745404"},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"session_engaged"},{"v":{"f":[{"v":"1"},{"v":null},{"v":null},{"v":null}]}}]}},{"v":{"f":[{"v":"ga_session_number"},{"v":{"f":[{"v":null},{"v":"6"},{"v":null},{"v":null}]}}]}}]}

Walter_C
Databricks Employee
Databricks Employee

Did you use any specific guide or code you can share with me to make this connection?

@Walter_C  - I've pretty much connected via unity catalog by adding BigQuery as a new connection. I was asked and added my json key from Google Console for the specific BigQuery project when creating the connection:
https://docs.databricks.com/en/query-federation/bigquery.html

szymon_dybczak
Esteemed Contributor III

Hi @KristiLogos ,

I had the same problem as you, but in my case I was extracting ga4 data from big query using ADF. 

Try to use following code for every column that has JSON content.

select TO_JSON_STRING(items) as jsondata FROM ...

The result will contain the proper names instead of the "v" names.

KristiLogos
Contributor

@szymon_dybczak I couldn't run select TO_JSON_STRING(event_params) as event_params FROM ...
I don't think thats a built-in Databricks. Is there another way you've had success?

error:
[UNRESOLVED_ROUTINE] Cannot resolve routine `TO_JSON_STRING` on search path

szymon_dybczak
Esteemed Contributor III

Sorry @KristiLogos , I didn't notice that you're using lakehouse federation. In my case the extraction part was done in ADF, so I was able to use bigquery TO_JSON_STRING function to resolve this issue.
You can also take a look at below discussion and try to use autoflatten_optimised function. 
Or take similar approach as @AnaMocanu in that thread.

Solved: Best way to parse Google Analytics data in Databri... - Databricks Community - 66976

View solution in original post

@szymon_dybczak No worries, and thanks I'll try that out. Also, was there a reason in particular you went with ADF and not federated connection to BQ? Just curious