Data Engineering
Kaniz
Community Manager

I first set up a Delta Live Table using Python as follows.

import dlt

# transaction_schema and path are defined elsewhere in the notebook
@dlt.table
def transaction():
  return (
    spark
    .readStream
    .format("cloudFiles")  # Auto Loader
    .schema(transaction_schema)
    .option("cloudFiles.format", "parquet")
    .load(path)
  )
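For reference, transaction_schema and path are assumed to be defined earlier in the notebook. A minimal, hypothetical version (the field names are illustrative, not from the original post) might look like:

from pyspark.sql.types import StructType, StructField, StringType, TimestampType, DoubleType

# Hypothetical schema and source path; adjust to your own data.
transaction_schema = StructType([
    StructField("transaction_id", StringType(), True),
    StructField("amount", DoubleType(), True),
    StructField("timestamp", TimestampType(), True),
])
path = "/mnt/raw/transactions/"  # directory of incoming parquet files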

And I configured the pipeline to publish to the target database test with the following settings.

{
    "id": <id>,
    "clusters": [
        {
            "label": "default",
            "autoscale": {
                "min_workers": 1,
                "max_workers": 5
            }
        }
    ],
    "development": true,
    "continuous": false,
    "edition": "core",
    "photon": false,
    "libraries": [
        {
            "notebook": {
                "path": <path>
            }
        }
    ],
    "name": "dev pipeline",
    "storage": <storage>,
    "target": "test"
}
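For what it's worth, if you keep these settings as a JSON document, one way to apply them is the Pipelines REST API. A minimal sketch (the workspace URL and token are placeholders, and the <path>/<storage> values must be filled in):

import requests

WORKSPACE_URL = "https://<workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                           # placeholder

settings = {
    "clusters": [{"label": "default", "autoscale": {"min_workers": 1, "max_workers": 5}}],
    "development": True,
    "continuous": False,
    "edition": "core",
    "photon": False,
    "libraries": [{"notebook": {"path": "<path>"}}],
    "name": "dev pipeline",
    "storage": "<storage>",
    "target": "test",
}

# Create the pipeline; the response JSON contains the new pipeline_id.
resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/pipelines",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=settings,
)
resp.raise_for_status()
print(resp.json())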

Everything worked as expected in the first trial.

After a while, I noticed that I had forgotten to add a partition column to the table, so I dropped the table in test with DROP TABLE test.transaction and updated the notebook to

import dlt
from pyspark.sql import functions as F

@dlt.table(
  partition_cols=["partition"],
)
def transaction():
  return (
    spark
    .readStream
    .format("cloudFiles")
    .schema(transaction_schema)
    .option("cloudFiles.format", "parquet")
    .load(path)
    # derive a date-typed partition column from the event timestamp
    .withColumn("partition", F.to_date("timestamp"))
  )

However, when I reran the pipeline, I got an error:

org.apache.spark.sql.AnalysisException: Cannot change partition columns for table transaction.
Current: 
Requested: partition

It seems I can't change the partition columns just by dropping the target table.
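My guess is that DLT keeps the table data and metadata under the pipeline's storage location, so dropping the metastore table alone doesn't reset the partition spec. One option I'm considering is a full refresh of the pipeline, e.g. via the Pipelines REST API. A minimal sketch (workspace URL, token, and pipeline ID are placeholders, and I haven't confirmed this resets partitioning):

import requests

WORKSPACE_URL = "https://<workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                           # placeholder
PIPELINE_ID = "<pipeline-id>"                               # placeholder

# Trigger an update with full_refresh, which rebuilds the managed tables
# from scratch instead of processing data incrementally.
resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/pipelines/{PIPELINE_ID}/updates",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"full_refresh": True},
)
resp.raise_for_status()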

What is the proper way to change partition columns in delta live tables?

CC: @Kit Yam Tse


4 REPLIES

RiyazAli
Contributor III

@Kaniz Fatma - is the error because the partition column is created on the fly rather than being a predefined column?

I'm also intrigued by the flow of execution of the DLT script above. As I understand it, readStream first creates a DataFrame with the new column named partition, and the DLT table is then created partitioned by that column?

Kaniz
Community Manager

Hi @Riyaz Ali, this question has been posted on behalf of Kit Yam Tse.

RiyazAli
Contributor III

Oh okay, got it. Thanks!

Hi @Kaniz Fatma,

Was the original requester able to see the response? Are there any follow-up questions?
