We have a cursor in DB2 that reads data from two tables on each iteration of a loop. At the end of each iteration, after inserting a row into a target table, we update the records related to that iteration in those two tables before moving on to the next one. An indicative example is below:
OPEN CUR1
FETCH CUR1 INTO V_A1, V_A2, V_C1, V_C3, V_M1, V_M2
WHILE SQLCODE = 0 DO
    SELECT M1 INTO V_M1 FROM TABLE_1 WHERE A1 = V_A1
    SELECT M2 INTO V_M2 FROM TABLE_2 WHERE C1 = V_C1
    IF ..... THEN SET V_B1 = V_M1 - V_M2 ELSE ....
    INSERT INTO TARGET (...) VALUES (V_A1, V_A2, ...)
    UPDATE TABLE_1 SET M1 = M1 - V_B1 WHERE A1 = V_A1
    UPDATE TABLE_2 SET M2 = M2 - V_B1 WHERE C1 = V_C1
    FETCH CUR1 INTO V_A1, V_A2, V_C1, V_C3, V_M1, V_M2
END WHILE
CLOSE CUR1
Just to note that A1 and C1 are not unique across the data.
Could you please suggest a way to transform this logic into PySpark? Performance also matters, since we are dealing with a large amount of data. I looked at an RDD map() approach, but RDDs are immutable, which seems to rule out carrying updated state from one row to the next the way the cursor does. The sketch below shows the direction I was considering.
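This is only a rough sketch of what I had in mind, and it rests on several assumptions of my own: the cursor's rows can be materialized as one pre-joined source (SOURCE_VIEW is a placeholder name), an ORD column captures the cursor's processing order, and the balances of each (A1, C1) pair are independent, i.e. the same A1 does not appear under two different C1 values (if the keys cross-link between groups, this per-group replay is wrong). All column names and types are placeholders too:

from pyspark.sql import SparkSession
import pandas as pd

spark = SparkSession.builder.appName("cursor_replay").getOrCreate()

# Hypothetical source: the rows CUR1 would fetch, pre-joined with the
# starting balances from TABLE_1 / TABLE_2 (all names are placeholders)
src = spark.table("SOURCE_VIEW")  # columns: A1, A2, C1, C3, M1, M2, ORD

def replay(pdf: pd.DataFrame) -> pd.DataFrame:
    # Replay the DB2 loop sequentially within one group, carrying the
    # running balances in local variables instead of issuing UPDATEs
    pdf = pdf.sort_values("ORD")
    m1 = float(pdf["M1"].iloc[0])   # starting balance for this A1
    m2 = float(pdf["M2"].iloc[0])   # starting balance for this C1
    out = []
    for row in pdf.itertuples(index=False):
        b1 = m1 - m2                # the THEN branch of the IF
        out.append((row.A1, row.A2, row.C1, row.C3, b1))
        m1 -= b1                    # replaces UPDATE TABLE_1
        m2 -= b1                    # replaces UPDATE TABLE_2
    return pd.DataFrame(out, columns=["A1", "A2", "C1", "C3", "B1"])

result = src.groupBy("A1", "C1").applyInPandas(
    replay, schema="A1 string, A2 string, C1 string, C3 string, B1 double"
)
result.write.mode("append").saveAsTable("TARGET")

What I cannot tell is whether this per-group replay is a reasonable pattern here, or whether there is a native DataFrame approach (window functions, cumulative sums) that avoids the Python loop entirely.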
Thank you in advance