Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

lnsnarayanan
by New Contributor II
  • 10788 Views
  • 8 replies
  • 12 kudos

Resolved! I cannot see the Hive databases or tables once I terminate the cluster and use another cluster.

I am using Databricks Community Edition for learning purposes. I created some Hive-managed tables through Spark SQL as well as with the df.saveAsTable option. But when I connect to a new cluster, "SHOW DATABASES" only returns the default database....

Latest Reply
dhpaulino
New Contributor II
  • 12 kudos

As the files are still in DBFS, you can just recreate the references to your tables and continue the work, with something like this:

db_name = "mydb"
from pathlib import Path
path_db = f"dbfs:/user/hive/warehouse/{db_name}.db/"
tables_dirs = dbutils.fs.l...

7 More Replies
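The truncated reply above can be fleshed out as a self-contained sketch. On Databricks you would list the warehouse directory with dbutils.fs.ls and run each statement via spark.sql; here the directory listing is simulated (the database and table names are illustrative) so the logic runs anywhere:

```python
# Hypothetical sketch: rebuild table references for a Hive database whose data
# files survived in DBFS after a Community Edition cluster was terminated.
# On Databricks, replace the simulated listing with dbutils.fs.ls(path_db).

def rebuild_table_ddl(db_name, table_dirs):
    """Generate DDL statements pointing at the surviving Delta directories."""
    base = f"dbfs:/user/hive/warehouse/{db_name}.db"
    statements = [f"CREATE DATABASE IF NOT EXISTS {db_name}"]
    for table in table_dirs:
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {db_name}.{table} "
            f"USING DELTA LOCATION '{base}/{table}'"
        )
    return statements

# Simulated output of listing the warehouse path (directory names only):
ddl = rebuild_table_ddl("mydb", ["sales", "customers"])
for stmt in ddl:
    print(stmt)  # on Databricks: spark.sql(stmt)
```

After re-registering, the tables reappear in SHOW DATABASES / SHOW TABLES on any new cluster, because the metastore entries now point back at the surviving files.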
v01d
by New Contributor III
  • 1373 Views
  • 1 replies
  • 0 kudos

Databricks Auto Loader authorization exception

Hello, I'm trying to run Databricks Auto Loader with the notifications=true option (Azure ADLS) and I get an unclear authorization error. The exception log is attached. It looks like all required permissions are granted to the service principal:

Screenshot_2024-06-01_at_14_32_06.png
AkasBala
by New Contributor III
  • 3044 Views
  • 3 replies
  • 0 kudos

Primary Key not working as expected on Unity Catalog delta tables

Hi @Chetan Kardekar. I noticed that you had commented on primary keys on Delta tables. Is that feature already released in Databricks Premium? I have Unity Catalog and I created a table with a primary key, though it doesn't act like a primary key...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Bala Akas​ Hope all is well! Just wanted to check in if you were able to resolve your issue and, if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

2 More Replies
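Context for the behavior described in the question: to my understanding, PRIMARY KEY constraints on Unity Catalog tables are informational only; they are recorded in the catalog (and used by the optimizer and BI tools) but not enforced at write time, so duplicate keys are not rejected. A sketch with an illustrative table name, showing the DDL form and the deduplication the writer must therefore do itself:

```python
# Sketch (assumption: Unity Catalog records but does not enforce PRIMARY KEY).
# The catalog/schema/table names are hypothetical; on Databricks run the DDL
# via spark.sql(ddl).
ddl = (
    "CREATE TABLE IF NOT EXISTS main.demo.orders ("
    "  order_id BIGINT NOT NULL,"
    "  amount   DOUBLE,"
    "  CONSTRAINT orders_pk PRIMARY KEY (order_id)"
    ")"
)

# Because the constraint is not enforced, deduplication is the writer's job,
# e.g. keep the last record seen per key before inserting:
rows = [(1, 10.0), (2, 5.0), (1, 12.5)]
deduped = list({order_id: (order_id, amt) for order_id, amt in rows}.values())
```

So a table "not acting like a primary key" in UC is expected: the constraint documents intent but a MERGE or pre-write dedup step is still needed to guarantee uniqueness.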
paranoid_jvm
by New Contributor II
  • 4191 Views
  • 0 replies
  • 1 kudos

Spark tasks getting stuck on one executor

Hi All, I am running a Spark job using a cluster with 8 executors with 8 cores each. The job involves executing a UDF. The job processes a few hundred thousand rows. When I run the job, each executor is assigned 8 tasks. Usually the job succeeds in les...

Erik
by Valued Contributor III
  • 8262 Views
  • 7 replies
  • 10 kudos

Resolved! How to use dbx for local development.

Databricks Connect is a program which allows you to run Spark code locally, while the actual execution happens on a Spark cluster. Notably, it allows you to debug and step through the code locally in your own IDE. Quite useful. But it is now being...

Latest Reply
FeliciaWilliam
Contributor
  • 10 kudos

I found answers to my questions here

6 More Replies
abaet
by New Contributor
  • 509 Views
  • 0 replies
  • 0 kudos

Random NoClassDefFound Error when running job

We are running a job on a cluster with DBR 10.4 LTS, Spark 3.2.1 and Scala 2.12. The cluster is using 4 workers (spot instances); the driver is not a spot instance. Randomly (only in one environment, and not in all executions), we are getting the following err...

fury-kata
by New Contributor II
  • 1161 Views
  • 1 replies
  • 0 kudos

ModuleNotFoundError when run with foreachBatch on serverless mode

I am using notebooks to do some transformations. I install a new whl:

%pip install --force-reinstall /Workspace/<my_lib>.whl
%restart_python

Then I successfully import the installed lib:

from my_lib.core import test

However, when I run my code with fo...

wilco
by New Contributor II
  • 2172 Views
  • 2 replies
  • 0 kudos

SQL Warehouse: Retrieving SQL ARRAY Type via JDBC driver

Hi all, we are currently running into the following issue. We are using a serverless SQL warehouse. In a Java application we are using the latest Databricks JDBC driver (v2.6.36). We are querying the warehouse with a collect_list function, which should return...

Latest Reply
KTheJoker
Databricks Employee
  • 0 kudos

Hey Wilco, the answer is no: ODBC/JDBC don't support complex types, so these need to be compressed into strings over the wire (usually in a JSON representation) and rehydrated on the client side into a complex object.

1 More Replies
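The rehydration step the reply describes can be sketched briefly. The payload below is illustrative (the asker's Java application would use a JSON library such as Jackson the same way); the point is only that an ARRAY produced by collect_list arrives through the driver as one JSON-encoded string column:

```python
import json

# Sketch: a collect_list result read over JDBC as a single string column.
# The client parses it back into a native list.
raw_from_jdbc = '["2024-01-01", "2024-01-02", "2024-01-03"]'
values = json.loads(raw_from_jdbc)
print(values)  # a real Python list, not a string
```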
source2sea
by Contributor
  • 3113 Views
  • 2 replies
  • 0 kudos

Resolved! ERROR RetryingHMSHandler: NoSuchObjectException(message:There is no database named global_temp)

ERROR RetryingHMSHandler: NoSuchObjectException(message:There is no database named global_temp). Should one create it in the workspace manually via the UI, and how? Would it get overwritten if the workspace is created via Terraform? I use the 10.4 LTS runtime.

Latest Reply
ashish2007g
New Contributor II
  • 0 kudos

I am experiencing significant delay on my streaming job. I am using the change feed connector. It processes streaming batches very frequently but experiences sudden halts and shows no active stage for a long time. I observed the below exception continuously promp...

1 More Replies
kskistad
by New Contributor III
  • 5506 Views
  • 2 replies
  • 4 kudos

Resolved! Streaming Delta Live Tables

I'm a little confused about how streaming works with DLT. My first question is: what is the difference in behavior if you set the pipeline mode to "Continuous" but in your notebook you don't use the "streaming" prefix on table statements, and simila...

Latest Reply
Harsh141220
New Contributor II
  • 4 kudos

Is it possible to have custom upserts into streaming tables in a Delta Live Tables pipeline? Use case: I am trying to maintain a valid session based on a timestamp column and want to upsert into the target table. Tried going through the documentation, but dl...

1 More Replies
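The usual workaround for the upsert question above is a plain Structured Streaming job with foreachBatch running a MERGE per micro-batch (DLT streaming tables do not expose custom upserts directly; DLT's own apply_changes API covers CDC-style cases). A sketch with illustrative table/column names; only the MERGE statement is built here so the logic is testable off-cluster:

```python
# Hedged sketch: build the MERGE used inside a foreachBatch upsert.
# Table, view, and column names below are hypothetical.

def build_merge_sql(target, source_view, key, columns):
    """MERGE the micro-batch view into the target table, keyed on `key`."""
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in columns)
    cols = ", ".join(columns)
    vals = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} t USING {source_view} s "
        f"ON t.{key} = s.{key} "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({cols}) VALUES ({vals})"
    )

# In the stream, each micro-batch registers a temp view and runs the MERGE:
# def upsert(batch_df, batch_id):
#     batch_df.createOrReplaceTempView("updates")
#     batch_df.sparkSession.sql(build_merge_sql(
#         "sessions", "updates", "session_id", ["session_id", "ts", "state"]))
# df.writeStream.foreachBatch(upsert).start()
sql = build_merge_sql("sessions", "updates", "session_id",
                      ["session_id", "ts", "state"])
```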
sreeyv
by New Contributor II
  • 853 Views
  • 2 replies
  • 0 kudos

Unable to execute update statement through Databricks Notebook

I am unable to execute update statements through Databricks Notebook, getting this error message "com.databricks.sql.transaction.tahoe.actions.InvalidProtocolVersionException: Delta protocol version is too new for this version of the Databricks Runti...

Latest Reply
sreeyv
New Contributor II
  • 0 kudos

This is resolved. This happens when a column in the table has GENERATED BY DEFAULT AS IDENTITY defined. When you remove this column, it works fine.

1 More Replies
deepu
by New Contributor II
  • 1190 Views
  • 1 replies
  • 1 kudos

performance issue with SIMBA ODBC using SSIS

I was trying to upload data into a table in hive_metastore using SSIS with the Simba ODBC driver. The data set is huge (1.2 million records, 20 columns) and it is taking more than 40 minutes to complete. Is there a config change to improve the load time?

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Looks like a slow data upload into a table in hive_metastore using SSIS and the SIMBA ODBC driver. This could be due to a variety of factors, including the size of your dataset and the configuration of your system. One potential solution could be to ...

Ramseths
by New Contributor
  • 845 Views
  • 1 replies
  • 0 kudos

Wrong Path Databricks Repos

In a Databricks environment, I have cloned a repository that I have in Azure DevOps Repos; the repository is inside the path Workspace/Repos/<user_mail>/my_repo. Then, when I create a Python script that I want to call in a notebook using an import: imp...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @Ramseths , If your notebook and script are in the same path, it would have picked the same relative path. Is your notebook located in /databricks/driver? Thanks!

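When the notebook and the script are not on the same relative path, the common fix is to put the repo root on sys.path before importing. A minimal sketch, keeping the placeholder path from the question:

```python
import sys

# Sketch: make modules in a cloned repo importable from a notebook.
# The path is a placeholder for the actual repo location under /Workspace/Repos.
repo_root = "/Workspace/Repos/<user_mail>/my_repo"
if repo_root not in sys.path:
    sys.path.append(repo_root)

# After this, `import my_module` resolves against files at the repo root.
```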
JonLaRose
by New Contributor III
  • 2396 Views
  • 2 replies
  • 0 kudos

Adding custom Jars to SQL Warehouses

Hi there, I want to add custom JARs to a SQL warehouse (Pro, if that matters) like I can on an interactive cluster, yet I don't see a way. Is that degraded functionality when transitioning to a SQL warehouse, or have I missed something? Thank you.

Latest Reply
SparkJun
Databricks Employee
  • 0 kudos

ADD JAR is SQL syntax for the Databricks runtime; it does not work for DBSQL/warehouses. DBSQL would throw this error: [NOT_SUPPORTED_WITH_DB_SQL] LIST JAR(S) is not supported on a SQL warehouse. SQLSTATE: 0A000. This feature is not supported as of now....

1 More Replies
leungi
by Contributor
  • 2838 Views
  • 6 replies
  • 1 kudos

Resolved! Unable to add column comment in Materialized View (MV)

The following doc suggests the ability to add column comments during MV creation via the `column list` parameter. Thus, the SQL code below is expected to generate a table where the columns `col_1` and `col_2` are commented; however, this is not the ca...

Latest Reply
raphaelblg
Databricks Employee
  • 1 kudos

@leungi you've shared the Python language reference. This is the SQL reference on which I've based my example.

5 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group