2 weeks ago
Hi Community,
Until recently I was happily deleting Delta tables in ADLS Gen together with their associated _delta_log folder, and then recreating the same table with a new _delta_log folder.
Now, after deleting a table and its associated _delta_log folder, when I attempt to create a new table with the same name I get the error:
DeltaIllegalStateException: The protocol of your Delta table could not be recovered while Reconstructing version: 0. Did you manually delete files in the _delta_log directory?
Has Databricks changed something that prevents people from recreating new deltaTables with the same table name?
Can someone please let me know how to resolve this?
2 weeks ago
1. If you purposely removed _delta_log/ but kept the data files:
Delete any remaining _delta_log/ completely, then convert the Parquet directory back to Delta to create a fresh log:
-- path has only Parquet data now
CONVERT TO DELTA parquet.`abfss://<container>@<account>.dfs.core.windows.net/<path>/`
-------------------------------------------------------------------------------------------------------------------------------------
2. Ensure the path is 100% empty (no hidden _delta_log):
# in a Databricks notebook
display(dbutils.fs.ls("abfss://<container>@<account>.dfs.core.windows.net/<path>"))
display(dbutils.fs.ls("abfss://<container>@<account>.dfs.core.windows.net/<path>/_delta_log"))
If anything remains, remove the table's entire base folder (the table path itself, not a parent folder):
dbutils.fs.rm("abfss://<container>@<account>.dfs.core.windows.net/<path>", recurse=True)
Then recreate the table:
CREATE OR REPLACE TABLE catalog.schema.table
USING DELTA
LOCATION 'abfss://<container>@<account>.dfs.core.windows.net/<path>'
AS SELECT * FROM some_source;
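The emptiness check in step 2 can be wrapped in a small helper so that it runs before every recreate. This is only a sketch: `leftover_entries` and `is_safe_to_recreate` are hypothetical names, the listing function is passed in (in a notebook that would be `dbutils.fs.ls`), and recognising a missing path by looking for "FileNotFoundException" in the error message is an assumption. Any other failure, such as an authentication error, is deliberately re-raised rather than read as "the path is empty".

```python
from typing import Callable, Iterable, List


def leftover_entries(list_dir: Callable[[str], Iterable], path: str) -> List[str]:
    """Return the names of everything still present under `path`.

    `list_dir` is injected so the helper can be tried outside Databricks;
    in a notebook, pass `dbutils.fs.ls`.
    """
    try:
        entries = list(list_dir(path))
    except Exception as exc:
        # Assumption: a missing path surfaces as an error whose message
        # mentions FileNotFoundException. Anything else (for example a
        # storage authentication failure) is re-raised unchanged.
        if "FileNotFoundException" in str(exc):
            return []
        raise
    # dbutils.fs.ls returns FileInfo objects with a .name attribute;
    # fall back to str() so plain strings also work.
    return [getattr(entry, "name", str(entry)) for entry in entries]


def is_safe_to_recreate(list_dir: Callable[[str], Iterable], path: str) -> bool:
    """True only when the table path is missing or completely empty."""
    return not leftover_entries(list_dir, path)
```

In a notebook this might be called as `is_safe_to_recreate(dbutils.fs.ls, "abfss://<container>@<account>.dfs.core.windows.net/<path>")` before running the CREATE statement.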
2 weeks ago - last edited 2 weeks ago
Hi @Carlton ,
This could happen for the following reason:
When files are manually removed or not removed correctly, the Delta log versions become non-contiguous.
You can try to follow this guide to resolve this issue:
Also, the approach you were using (manually deleting files) is not recommended by Databricks.
Instead of dropping and recreating Delta tables, the official recommendation is to use the CREATE OR REPLACE command.
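To see whether a log really has become non-contiguous, the commit file names in _delta_log can be scanned for gaps. The sketch below relies on Delta commit files being named as a 20-digit zero-padded version followed by `.json`; `missing_commit_versions` is a hypothetical helper name, and it only reports gaps between the lowest and highest commit present, because older commits may legitimately have been cleaned up after a checkpoint.

```python
import re
from typing import Iterable, List

# Commit files look like 00000000000000000003.json; checkpoints, .crc files
# and _last_checkpoint do not match this pattern and are ignored.
_COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def missing_commit_versions(file_names: Iterable[str]) -> List[int]:
    """Return commit versions absent between the first and last commit found."""
    versions = []
    for name in file_names:
        match = _COMMIT_FILE.match(name)
        if match:
            versions.append(int(match.group(1)))
    if not versions:
        return []
    present = set(versions)
    return [v for v in range(min(versions), max(versions) + 1) if v not in present]
```

In a notebook, the input could be built with something like `[f.name for f in dbutils.fs.ls("<table path>/_delta_log")]`.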
2 weeks ago
Hello @Carlton ,
Good Day!
What is the DBR version? How are you creating the table? Did you drop the table and recreate it using SQL?
2 weeks ago
Hi Guys,
I really appreciate you getting in touch, this has been driving me crazy.
I delete the entire folder from ADLS. Admittedly, I didn't remove the folder from within Databricks using dbutils.fs.rm.
If you take a look at my image, I want to remove 'Country'. So I would delete the Country folder and all sub-folders, e.g. folder 1 and then the _delta_log folder. But when I attempt to create another Country deltaTable, I get the error.
Are you suggesting that I should delete from within Databricks?
2 weeks ago
Hi @Carlton ,
Yes, delete from inside Databricks (and drop the table) when you plan to reuse the same path/name. External deletes can leave you with a half-cleaned path or stale state, which leads to the "protocol could not be recovered (version: 0)" error.
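A minimal sketch of that cleanup, run from a notebook, might look like the following. `drop_table_and_path` is a hypothetical helper name; `spark` and `dbutils` are the objects a Databricks notebook provides and are passed in explicitly here. For an external table DROP TABLE only removes the catalog entry, so the path is removed as a separate step.

```python
def drop_table_and_path(spark, dbutils, table_name: str, path: str) -> bool:
    """Drop the catalog entry, then recursively delete the table's folder.

    `path` must be the table's own folder (for example the Country folder),
    never a parent folder that holds other tables.
    """
    # Remove the metadata first so the catalog no longer points at the path.
    spark.sql(f"DROP TABLE IF EXISTS {table_name}")
    # Recursively delete data files and _delta_log; returns the result of rm.
    return dbutils.fs.rm(path, recurse=True)
```

Usage would be along the lines of `drop_table_and_path(spark, dbutils, "catalog.schema.table", "abfss://<container>@<account>.dfs.core.windows.net/<path>")`, followed by the CREATE statement.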
2 weeks ago
Sorry I forgot to add the image:
2 weeks ago - last edited 2 weeks ago
Hi,
Delete all folders in ADLS related to Country, and also execute the DROP TABLE command in the catalog where this table resides.
The catalog probably still stores metadata about this table and thinks it exists.
Then try to create the table, but only after you have performed the steps above.
2 weeks ago
@Carlton This error means you are not passing the client ID and client secret in the Spark config used to authenticate against the storage account, or the client secret has expired.
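For reference, a service principal is usually wired up through a handful of per-account Spark settings. The sketch below only builds those settings as a dictionary; `abfs_oauth_conf` is a hypothetical helper name, and the storage account, client ID, secret and tenant ID are placeholders you would supply yourself (the secret ideally read from a secret scope rather than typed into the notebook).

```python
from typing import Dict


def abfs_oauth_conf(storage_account: str, client_id: str,
                    client_secret: str, tenant_id: str) -> Dict[str, str]:
    """Build the Spark settings for OAuth (service principal) access to ABFS."""
    host = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


# In a notebook (scope and key names below are placeholders):
# secret = dbutils.secrets.get(scope="<scope>", key="<key>")
# for key, value in abfs_oauth_conf("<account>", "<client-id>", secret, "<tenant-id>").items():
#     spark.conf.set(key, value)
```

If listing the path still fails after these are set, an expired client secret is one thing worth checking.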
2 weeks ago
Hi szymon_dybczak, if you take a look at the image, are you suggesting I delete the folder BASE? That folder is the equivalent to our SILVER layer. Deleting that folder would be devastating to our business. I deleted from DATA and all sub-folders below DATA including Country. Shouldn't that be enough?
2 weeks ago
Hi, sorry. I should have been more specific. You should delete the Country folder along with all its subfolders (basically the folder in which the data for the table you want to delete resides).
After that, execute DROP TABLE command on the catalog where the table you wanted to delete was registered.
2 weeks ago
Hi szymon_dybczak, no worries, I'm just grateful that you're helping me out. I did what you suggested, along with deleting the table from the catalog (see image), but I'm still getting the same error.
I should mention that I'm getting the following error when I try to list files:
Invalid configuration value detected for fs.azure.account.key
2 weeks ago
I should also mention that not only do I get the error:
DeltaIllegalStateException: The protocol of your Delta table could not be recovered while Reconstructing version: 0. Did you manually delete files in the _delta_log directory?
The table folder that is created looks like the following:
2 weeks ago
I should also mention that I am using the following code to create the deltaTable, which I have been using for years