excavator-matt
Contributor III

Also, I forgot to mention the workaround for the first approach. If you write the pandas DataFrame to a Parquet file in a volume, you can then convert it to a Delta table in a later cell.

Instead of this

projects_pdf.to_delta("europe_prod_catalog.ad_hoc.project_recommendation_stage", mode="overwrite")

You do this

import pandas as pd

# Avoid the datetime64 timestamp error by casting datetime columns to strings
def convert_datetime_columns_to_str(df):
    for col in df.columns:
        if pd.api.types.is_datetime64_any_dtype(df[col]):
            df[col] = df[col].astype(str)
    return df

projects_pdf_fixed = convert_datetime_columns_to_str(projects_pdf)
projects_pdf_fixed.to_parquet("/Volumes/europe_prod_catalog/ad_hoc/temp/project_recommendation_embedding.parquet")
 
Then, in the next cell, you do this:
project_embedded_df = spark.read.parquet("/Volumes/europe_prod_catalog/ad_hoc/temp/project_recommendation_embedding.parquet")
project_embedded_df.write.mode("overwrite").saveAsTable("europe_prod_catalog.ad_hoc.project_recommendation_embedding")
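
One thing to be aware of: the helper above assigns into the DataFrame it receives, so projects_pdf itself ends up with string columns too. If you still need the original timestamps afterwards, a copy-based variant might look like the sketch below. The function name datetime_columns_to_str and the sample data are made up for illustration, not part of the original workaround.

```python
import pandas as pd

def datetime_columns_to_str(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every datetime64 column cast to str."""
    out = df.copy()  # leave the caller's DataFrame untouched
    for col in out.columns:
        if pd.api.types.is_datetime64_any_dtype(out[col]):
            out[col] = out[col].astype(str)
    return out

# Hypothetical sample data standing in for projects_pdf
sample_pdf = pd.DataFrame({
    "project_id": [1, 2],
    "created_at": pd.to_datetime(["2024-01-15", "2024-02-20"]),
})
fixed_pdf = datetime_columns_to_str(sample_pdf)
```

Here sample_pdf keeps its datetime64 column, while fixed_pdf holds the string version that you would pass to to_parquet.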
 
