Databricks Community

Mario_D · ‎10-25-2024

I'm not sure whether this is the right place, but we've encountered a bug in datasets.py (https://github.com/mlflow/mlflow/blob/master/mlflow/recipes/steps/ingest/datasets.py). Anyone using MLflow Recipes should be aware of it: the check that selects date columns compares against the wrong string, as noted in the comment in the snippet.

def _convert_spark_df_to_pandas(self, spark_df):
    import pandas as pd

    datetime_cols = [
        # this should be:
        # field.name for field in spark_df.schema.fields if str(field.dataType) == "DateType()"
        field.name for field in spark_df.schema.fields if str(field.dataType) == "DateType"
    ]
    pandas_df = spark_df.toPandas()
    pandas_df[datetime_cols] = pandas_df[datetime_cols].apply(pd.to_datetime, errors="coerce")

    return pandas_df
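
A possible way to avoid depending on the exact string form of the type is to check the class rather than its repr. The sketch below is not the MLflow code: `date_column_names` is a hypothetical helper, and `DateType`, `StringType`, and `Field` are minimal stand-ins for the PySpark classes so the example runs without Spark. It only illustrates why comparing `str(field.dataType)` to "DateType" can silently match nothing when the repr is "DateType()".

```python
from dataclasses import dataclass
from typing import Any, List


def date_column_names(fields: List[Any]) -> List[str]:
    """Return names of fields whose dataType class is named DateType.

    Checks the class name instead of str(dataType), so the result does not
    depend on whether the repr is "DateType" or "DateType()".
    """
    return [f.name for f in fields if type(f.dataType).__name__ == "DateType"]


# --- Hypothetical stand-ins for pyspark.sql.types, for illustration only ---
class DateType:
    """Stand-in whose repr includes parentheses, as reported in the post."""

    def __repr__(self) -> str:
        return "DateType()"


class StringType:
    def __repr__(self) -> str:
        return "StringType()"


@dataclass
class Field:
    """Stand-in for a schema field with a name and a dataType."""

    name: str
    dataType: Any


fields = [Field("id", StringType()), Field("signup_date", DateType())]

# Comparing the string form against "DateType" finds nothing with this repr ...
by_string = [f.name for f in fields if str(f.dataType) == "DateType"]
# ... while the class-based check is unaffected by the repr format.
by_class = date_column_names(fields)

print(by_string)  # []
print(by_class)   # ['signup_date']
```

With a real Spark DataFrame you would pass `spark_df.schema.fields`, and `isinstance(field.dataType, DateType)` with `DateType` imported from `pyspark.sql.types` would probably be the more idiomatic form of the same check.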

stbjelcevic · ‎11-06-2025

Hi @Mario_D,

Thanks for bringing this to our attention. I will pass this information along to the appropriate team!
