Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Hello, I have been working on this issue as a proof of concept - it would be extremely helpful to iterate through tables via loops in a few scenarios. I have a simple three-column dimension that I added to a cached table.
cache lazy table hedis_cache s...
Got it to work, thank you for the tip! I needed to convert the dataframe over to a pandas dataframe: https://www.geeksforgeeks.org/convert-pyspark-dataframe-to-dictionary-in-python/
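For anyone landing here later, the conversion described above might look roughly like the sketch below. The helper name rows_as_dicts is made up for illustration and is not from the original reply. The sketch assumes the table is small enough to fit on the driver, since toPandas() collects every row.

```python
def rows_as_dicts(spark_df):
    """Collect a small Spark DataFrame to the driver as a list of row dicts.

    Hypothetical helper: toPandas() pulls every row to the driver, so this
    is only suitable for small lookup or dimension tables.
    """
    return spark_df.toPandas().to_dict(orient="records")


# Assumed usage in a notebook where `spark` is the active session and
# hedis_cache is the cached table from the question:
#
# for row in rows_as_dicts(spark.table("hedis_cache")):
#     print(row)  # each row is a plain dict keyed by column name
```

Once the rows are plain dictionaries, an ordinary Python for loop can drive whatever per-row logic is needed.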
I have a Delta table created by:
%sql
CREATE TABLE IF NOT EXISTS dev.bronze.test_map (
  id INT,
  table_updates MAP<STRING, TIMESTAMP>,
  CONSTRAINT test_map_pk PRIMARY KEY(id)
) USING DELTA
LOCATION "abfss://bronze@Table Path"
With initi...
Hi @Mohammad Saber, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedba...
Hey everyone, I want to avoid repeating the when function 12 times, so I thought of using a dictionary. I don't know if it's a limitation of the Spark function or a logic error. Does the function allow this concatenation?
Hello everyone, I found this alternative to reduce repeated code.
custoDF = (custoDF.withColumn('month', col('Nummes').cast('string'))
           .replace(months, subset=['month']))
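The snippet above does not show how months is defined. A minimal sketch of one possibility follows, assuming Nummes holds month numbers 1 to 12 and that English month names are wanted. The dictionary contents and the add_month_name wrapper are assumptions for illustration, not the poster's actual code. As far as I know, DataFrame.replace needs the dictionary keys and values to share a type, which would explain the cast to string.

```python
import calendar

# Assumed mapping: month number as a string -> English month name.
# The original post does not show its dictionary; this is one way to build it.
months = {str(n): calendar.month_name[n] for n in range(1, 13)}


def add_month_name(df, number_col="Nummes", mapping=months):
    """Return df with a 'month' column holding the mapped month name."""
    # Imported here so the mapping above can be built without a Spark install.
    from pyspark.sql.functions import col

    return (
        df.withColumn("month", col(number_col).cast("string"))
          # A dict passed to replace() maps old values to new ones in `subset`.
          .replace(mapping, subset=["month"])
    )


# Assumed usage with the DataFrame from the post:
# custoDF = add_month_name(custoDF)
```

This keeps the twelve mappings in one dictionary instead of twelve chained when calls.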