cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Migrating jsonb data from Postgresql database to Databricks

Karene
New Contributor

Hi Team,

I am trying to create a pipeline to incrementally ingest data from an RDS postgresql database which contains tables that have some columns of jsonb data type. I am currently using AWS DMS with CDC to first load the data into an S3 bucket as csv files, and then using Databricks Autoloader to ingest the files into a streaming delta table.

Currently, the json data is being stored as a string data type, whereas I would like it to be stored as a struct data type so that it can be queried.

What is the best way to achieve this with Autoloader so that the ingested data has the jsonb columns as struct data types? This is the code I am using to ingest the data -

 

spark.readStream.format("cloudFiles")
.option("cloudFiles.format", "csv")
.option("cloudFiles.inferSchema", "true")
.option("cloudFiles.inferColumnTypes", "true")
.load("s3://path/to/bucket")
 
Thanks in advance!
1 REPLY 1

BR_DatabricksAI
Contributor III

Hello Karene, 

You can do the transformation in following manner from string to struct and refer to the example below:

 

data =[('001','{"name":"bhupendra","zipcode":"260100"}')]
schema = ['id','propertytype']

df = spark.createDataFrame(data,schema)
df.show(truncate=False)
df.printSchema()
from pyspark.sql.functions import from_json
from pyspark.sql.types import StructType, StructField,StringType

structTypeSchema = StructType([\
    StructField('name',StringType()),\
    StructField('zipcode',StringType())])

df1 = df.withColumn('propertystructtype', from_json(df.propertytype, structTypeSchema))
df1.show(truncate=False)
df1.printSchema()

 

BR

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now