10-13-2021 05:58 PM
Whenever my cluster is terminated, I lose my whole database (I'm not sure if it's related, but I made those databases in Delta format). And since the cluster is terminated after 2 hours of inactivity, I wake up with no database every morning.
I don't want to run code every morning to recreate the whole database.
Is there any way that I can preserve my database?
I tried cloning the cluster, but that didn't bring my database back. I also tried to restart the cluster, but it could not be restarted.
10-14-2021 12:58 AM
Please check where on DBFS the database/tables are created, and check in the file system whether the files are still there.
Sharing the code you use to create the database and tables could also be useful.
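A minimal way to check, assuming the default Hive warehouse location (database_name below is a placeholder, replace it with your own):

# List the table directories under the database's default warehouse path
db_name = "database_name"
for f in dbutils.fs.ls(f"dbfs:/user/hive/warehouse/{db_name}.db/"):
    print(f.path)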
10-14-2021 01:27 AM
Hello, HubertDudek!
Thank you for the help and advice!
This is where I think that my database/tables are located:
dbfs:/user/hive/warehouse/db_name.db/table_name/
This is the code that I use to create the database:
%sql
CREATE DATABASE IF NOT EXISTS database_name;
USE database_name;
And this is the code that I use to create a table:
(df.write
    .format('delta')
    .mode('overwrite')
    .saveAsTable(table_name))
10-14-2021 05:28 AM
Do you happen to use the Community Edition? Apparently there are limitations concerning own databases.
(https://community.databricks.com/s/feed/0D53f00001HKI7ACAX)
10-14-2021 05:32 AM
Ahhh yes! I am using the Community Edition! Now I see that's the reason. Thank you for helping me!
05-02-2023 08:57 AM
So how do I work around this? I am a student working on an assignment and I need to finish it, but two hours is not enough time!
05-03-2023 01:54 AM
OK, how about this: download your files from DBFS to your computer.
This is not ideal, but at least you do not lose your data. When you want to work further on the downloaded files, you can upload them again using the UI, and download them again when you are finished.
Then create a table on the files (which is very easy, see the sketch below) and you are good to go.
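A minimal sketch of that last step, assuming the files were re-uploaded through the UI (the path and table name below are placeholders, not anything from this thread):

# Re-register a table over Delta files uploaded back to DBFS.
# 'dbfs:/FileStore/tables/my_table/' is a placeholder -- use your actual upload path.
spark.sql("""
    CREATE TABLE IF NOT EXISTS my_table
    USING DELTA
    LOCATION 'dbfs:/FileStore/tables/my_table/'
""")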
05-02-2023 11:07 AM
Once the cluster gets terminated, the table metadata is lost (although the underlying files may still be on DBFS).
06-03-2024 11:25 AM
As the files are still in DBFS, you can just recreate the references to your tables and continue working, with something like this:
db_name = "mydb"
from pathlib import Path
path_db = f"dbfs:/user/hive/warehouse/{db_name}.db/"
tables_dirs = dbutils.fs.ls(path_db)
for d in tables_dirs:
table_name = Path(d.path).name
spark.sql(f"""CREATE TABLE IF NOT EXISTS {table_name}
LOCATION '{d.path}'
""")