Databricks Community

matty_f · ‎07-10-2023

I'm working on a Python package that can be installed via pip. The package will manage a Delta table for the user, and new versions of the package may need to run migrations on this table.

Is this an okay format to use?

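def migrate(table_path):
    # Assumes the notebook globals `spark` and `dbutils` are in scope.
    mm_path = f"{table_path}/_migration_marker"
    marker = 0  # 0 means no migration has been applied yet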

    try:
        marker = int(dbutils.fs.head(mm_path, 10))
    except Exception as e:
        if "java.io.FileNotFoundException" in str(e):
            # No marker file yet: treat the table as unmigrated.
            print("Migration marker not found")
        else:
            raise

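    if marker < 1:
        # LOCATION takes a string literal, so the path is quoted.
        sql = f"""CREATE TABLE catalog.schema.tablename ... LOCATION '{table_path}'"""
        spark.sql(sql).count()
        marker = 1
        # dbutils.fs.put expects string contents, not an int.
        dbutils.fs.put(mm_path, str(marker), True)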

    if marker < 2:
        sql = f"""ALTER TABLE catalog.schema.tablename ..."""
        spark.sql(sql).count()
        marker = 2
        # dbutils.fs.put expects string contents, not an int.
        dbutils.fs.put(mm_path, str(marker), True)