cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

daindana
by New Contributor III
  • 3421 Views
  • 8 replies
  • 3 kudos

Resolved! How to preserve my database when the cluster is terminated?

Whenever my cluster is terminated, I lose my whole database(I'm not sure if it's related, I made those database with delta format. ) And since the cluster is terminated in 2 hours from not using it, I wake up with no database every morning.I don't wa...

  • 3421 Views
  • 8 replies
  • 3 kudos
Latest Reply
dhpaulino
New Contributor II
  • 3 kudos

 As the file still in the dbfs you can just recreate the reference of your tables and continue the work, with something like this:db_name = "mydb" from pathlib import Path path_db = f"dbfs:/user/hive/warehouse/{db_name}.db/" tables_dirs = dbutils.fs....

  • 3 kudos
7 More Replies
Dilorom
by New Contributor
  • 4323 Views
  • 5 replies
  • 3 kudos

What is a recommended directory for creating a database with a specified path?

I was going through Data Engineering with Databricks training, and in DE 3.3L - Databases, Tables & Views Lab section, it says "Defining database directories for groups of users can greatly reduce the chances of accidental data exfiltration." I agree...

  • 4323 Views
  • 5 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Dilorom A​ Hope everything is going great.Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

  • 3 kudos
4 More Replies
Karthig
by New Contributor III
  • 18167 Views
  • 13 replies
  • 8 kudos

Error Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient - while trying to create database

Hello All,I get the org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient while trying to create a database scr...

image image.png image
  • 18167 Views
  • 13 replies
  • 8 kudos
Latest Reply
mroy
New Contributor III
  • 8 kudos

Alright, we've implemented a workaround for this, and so far it's been working very well:First, we created a reusable notebook to wait until Hive has been initialized (see code below).We then execute this notebook using the %run command at the top of...

  • 8 kudos
12 More Replies
same213
by New Contributor III
  • 3005 Views
  • 5 replies
  • 9 kudos

Is it possible to create a sqlite database and export it?

I am trying to create a sqlite database in databricks and add a few tables to it. Ultimately, I want to export this using Azure. Is this possible?

  • 3005 Views
  • 5 replies
  • 9 kudos
Latest Reply
same213
New Contributor III
  • 9 kudos

@Hubert Dudek​  We currently have a process in place that reads in a SQLite file. We recently transitioned to using Databricks. We were hoping to be able to create a SQLite file so we didn't have to alter the current process we have in place.

  • 9 kudos
4 More Replies
allan-silva
by New Contributor III
  • 2280 Views
  • 3 replies
  • 4 kudos

Resolved! Can't create database - UnsupportedFileSystemException No FileSystem for scheme "dbfs"

I'm following a class "DE 3.1 - Databases and Tables on Databricks", but it is not possible create databases due to "AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: org.apache.hadoop.fs.Unsupp...

  • 2280 Views
  • 3 replies
  • 4 kudos
Latest Reply
allan-silva
New Contributor III
  • 4 kudos

A colleague from my work figured out the problem: the cluster being used wasn't configured to use DBFS when running notebooks.

  • 4 kudos
2 More Replies
venkad
by Contributor
  • 786 Views
  • 0 replies
  • 4 kudos

Default location for Schema/Database in Unity

Hello Bricksters,We organize the delta lake in multiple storage accounts. One storage account per data domain and one container per database. This helps us to isolate the resources and cost on the business domain level.Earlier, when a schema/database...

  • 786 Views
  • 0 replies
  • 4 kudos
Dicer
by Valued Contributor
  • 2143 Views
  • 5 replies
  • 5 kudos

Resolved! Azure Databricks: AnalysisException: Database 'bf' not found

I wanted to save my delta tables in my Databricks database. When I saveAsTable, there is an error message Azure Databricks: AnalysisException: Database 'bf' not found​Ye, There is no database named "bf" in my database.Here is my full code:import os i...

  • 2143 Views
  • 5 replies
  • 5 kudos
Latest Reply
Dicer
Valued Contributor
  • 5 kudos

Some data can be saved as delta tables while some cannot.

  • 5 kudos
4 More Replies
LukaszJ
by Contributor III
  • 2953 Views
  • 6 replies
  • 4 kudos

Resolved! Terraform: get metastore id without creating new metastore

Hello,I want to create database (schema) and tables in my Databricks workspace using terraform.I found this resources: databricks_schemaIt requires databricks_catalog, which requires metastore_id.However, I have databricks_workspace and I did not cre...

  • 2953 Views
  • 6 replies
  • 4 kudos
Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @Łukasz Jaremek​ , Just a friendly follow-up. Do you still need help, or @Hubert Dudek (Customer)​ and @Atanu Sarkar​ 's response help you to find the solution? Please let us know.

  • 4 kudos
5 More Replies
Jeff1
by Contributor II
  • 2485 Views
  • 6 replies
  • 5 kudos

Resolved! Recommended database when using R in databricks

I'm new to integrating the sparklyr / R interface in databricks. In particular it appears that sparklyr and R commands and functions are dependent upon the type of dataframe one is working with (hive, Spark R etc). Is there a recommend best practice...

  • 2485 Views
  • 6 replies
  • 5 kudos
Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Jeff Reichman​ , Just a friendly follow-up. Do you still need help, or @Hubert Dudek (Customer)​ 's response help you to find the solution? Please let us know.

  • 5 kudos
5 More Replies
knight007
by New Contributor II
  • 2193 Views
  • 8 replies
  • 5 kudos

Containerized Databricks/Spark database

Hello. I'm fairly new to Databricks and Spark.I have a requirement to connect to Databricks using JDBC and that works perfectly using the driver I downloaded from the Databricks website ("com.simba.spark.jdbc.Driver")What I would like to do now is ha...

  • 2193 Views
  • 8 replies
  • 5 kudos
Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Gurps Bassi​ , Just a friendly follow-up. Do you still need help, or do @Hubert Dudek (Customer)​ and @Werner Stinckens​ 's responses help you find the solution? Please let us know.

  • 5 kudos
7 More Replies
Jan_A
by New Contributor III
  • 3289 Views
  • 5 replies
  • 5 kudos

Resolved! Move/Migrate database from dbfs root (s3) to other mounted s3 bucket

Hi,I have a databricks database that has been created in the dbfs root S3 bucket, containing managed tables. I am looking for a way to move/migrate it to a mounted S3 bucket instead, and keep the database name.Any good ideas on how this can be done?T...

  • 3289 Views
  • 5 replies
  • 5 kudos
Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Jan Ahlbeck​ , Did @DARSHAN BARGAL​ 's solution work in your case?

  • 5 kudos
4 More Replies
Logan_Data_Inc_
by New Contributor II
  • 1912 Views
  • 3 replies
  • 4 kudos

Resolved! Will the database be deleted when the cluster terminates ?

Hello,I'm working in Databricks Community Edition. So I will terminate my cluster after my work because anyways it will be terminated after 2 hours. I'm creating a database to store all my transformed data. Will the database be deleted when I termina...

  • 1912 Views
  • 3 replies
  • 4 kudos
Latest Reply
Anonymous
Not applicable
  • 4 kudos

@Hubert Dudek​ - Thanks for answering so quickly!!@Sriram Devineedi​ - If Hubert's answer solved the issue for you, would you be happy to mark his answer as best? That helps others know where to look.

  • 4 kudos
2 More Replies
William_Scardua
by Valued Contributor
  • 4772 Views
  • 5 replies
  • 12 kudos

The database and tables disappears when I delete the cluster

Hi guys,I have a trial databricks account, I realized that when I shutdown the cluster my databases and tables is disappear .. that is correct or thats is because my account is trial ?

  • 4772 Views
  • 5 replies
  • 12 kudos
Latest Reply
Prabakar
Esteemed Contributor III
  • 12 kudos

@William Scardua​ if it's an external hive metastore or Glue catalog you might be missing the configuration on the cluster. https://docs.databricks.com/data/metastores/index.htmlAlso as mentioned by @Hubert Dudek​ , if it's a community edition then t...

  • 12 kudos
4 More Replies
davidmory38
by New Contributor
  • 1204 Views
  • 1 replies
  • 0 kudos

Best Database for facial recognition/ Fast comparisons of Euclidean distance

Hello people,I'm trying to build a facial recognition application, and I have a working API, that takes in an image of a face and spits out a vector that encodes it. I need to run this on a million faces, store them in a db and when the system goes o...

  • 1204 Views
  • 1 replies
  • 0 kudos
Latest Reply
Dan_Z
Honored Contributor
  • 0 kudos

You could do this with Spark storing in parquet/Delta. For each face you would write out a record with a column for metadata, a column for the encoded vector array, and other columns for hashing. You could use a PandasUDF to do the distributed dista...

  • 0 kudos
User16790091296
by Contributor II
  • 1717 Views
  • 2 replies
  • 1 kudos

Database within a Database in Databricks

Is it possible to have a folder or database with a database in Azure Databricks? I know you can use the "create database if not exists xxx" to get a database, but I want to have folders within that database where I can put tables.

  • 1717 Views
  • 2 replies
  • 1 kudos
Latest Reply
brickster_2018
Esteemed Contributor
  • 1 kudos

The default location of a database will be in the /user/hive/warehouse/<databasename.db>. Irrespective of the location of the database the tables in the database can have different locations and they can be specified at the time of creation. Databas...

  • 1 kudos
1 More Replies
Labels