Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mkwparth
by New Contributor III
  • 1385 Views
  • 4 replies
  • 1 kudos

Spark Failed to start: Driver unresponsive

Hi everyone, I'm encountering an intermittent issue when launching a Databricks pipeline cluster. Error message: com.databricks.pipelines.common.errors.deployment.DeploymentException: Failed to launch pipeline cluster xxxx-xxxxxx-ofgxxxxx: Attempt to la...

Latest Reply
Gopichand_G
New Contributor II
  • 1 kudos

I have personally seen these kinds of issues. In my experience they usually happen because the driver node is unavailable or unresponsive: you may have hit maximum CPU or memory usage, or maybe your cache utili...
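If driver memory pressure is the suspect, a quick check is to avoid pulling large results onto the driver and to release cached data once you are done with it. A minimal sketch (the table name below is hypothetical, not from this thread):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Cache only while the DataFrame is actively reused, then release the memory.
df = spark.read.table("main.sales.transactions")  # hypothetical table
df.cache()
df.groupBy("region").count().show()
df.unpersist()

# Avoid df.collect() on large datasets; it materializes everything on the driver.
sample = df.limit(1000).toPandas()  # bounded pull instead of a full collect()
```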

3 More Replies
skooijman
by New Contributor II
  • 2465 Views
  • 4 replies
  • 7 kudos

dbt_project.yml won't load in databricks dbt job

We're running into issues with dbt jobs, which are not running anymore. The errors we receive suggest that the dbt_project.yml file cannot be found, while the profiles.yml can be found. We are running our dbt jobs with Databricks Workflows. We've tri...

Latest Reply
LokmenChouaya
New Contributor II
  • 7 kudos

Hello, are there any updates regarding this issue? I'm having the same problem in my prod environment.

3 More Replies
Phani1
by Databricks MVP
  • 3635 Views
  • 1 reply
  • 0 kudos

Databricks AI (LLM) Functionalities: Data Privacy and Security

Hi Databricks Team, When leveraging Databricks' AI (LLM) functionalities, such as ai_query and ai_assistant, how does Databricks safeguard customer data and ensure privacy, safety, and security? Regards, Phani

Latest Reply
Vinay_M_R
Databricks Employee
  • 0 kudos

Hello @Phani1, Databricks employs a multi-layered security approach to protect customer data when using AI functionalities like ai_query and Databricks Assistant. I am sharing the official documentation below for your reference: https://learn.microsoft.co...

Marvin_T
by New Contributor III
  • 21301 Views
  • 3 replies
  • 2 kudos

Resolved! Disabling query caching for SQL Warehouse

Hello everybody, I am currently trying to run some performance tests on queries in Databricks on Azure. For my tests, I am using a Classic SQL Warehouse in the SQL Editor. I have created two views that contain the same data but have different structur...

Latest Reply
Marvin_T
New Contributor III
  • 2 kudos

They are probably executing the same query plan, now that you mention it. And yes, restarting the warehouse does theoretically work, but it isn't a nice solution. I guess I will do some restarting and build averages to have a good comparison for now.
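For repeatable timings, one option is to turn off result-cache reuse for the session before each run. A minimal sketch from a notebook session, assuming the use_cached_result session setting is honored on your compute (verify against the Databricks SQL documentation); the view name is hypothetical:

```python
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Ask the session not to reuse cached query results (assumed setting name; verify).
spark.sql("SET use_cached_result = false")

start = time.time()
spark.sql("SELECT count(*) FROM my_catalog.my_schema.view_a").collect()  # hypothetical view
print(f"view_a: {time.time() - start:.2f}s")
```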

2 More Replies
KristiLogos
by Contributor
  • 1141 Views
  • 2 replies
  • 0 kudos

Netsuite error - The driver could not open a JDBC connection. Check the URL

I'm trying to connect to Netsuite2 with the JDBC driver I added to my cluster. I'm testing this in my Sandbox Netsuite and I have the code below, but it keeps saying: requirement failed: The driver could not open a JDBC connection. Check the URL: jdbc:...

Latest Reply
TheOC
Contributor III
  • 0 kudos

Hey @KristiLogos, I had a little search online and found this, which may be useful: https://stackoverflow.com/questions/79236996/pyspark-jdbc-connection-to-netsuite2-com-fails-with-failed-to-login-using-tba In short, it seems that a token-based connection...
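For reference, here is the generic shape of a Spark JDBC read. The exact URL format, driver class, and token-based authentication properties for NetSuite are assumptions and should be taken from the NetSuite JDBC driver documentation and the linked answer:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder values only; verify the URL shape and driver class for your driver version.
jdbc_url = "jdbc:ns://<account>.connect.api.netsuite.com:1708;ServerDataSource=NetSuite2.com;..."  # assumed
props = {
    "driver": "com.netsuite.jdbc.openaccess.OpenAccessDriver",  # assumed class name
    "user": "<token-based credentials per the driver docs>",
    "password": "<token-based credentials per the driver docs>",
}

df = spark.read.jdbc(url=jdbc_url, table="transaction", properties=props)
df.show(5)
```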

1 More Replies
seapen
by New Contributor II
  • 1245 Views
  • 1 reply
  • 0 kudos

[Question]: Get permissions for a schema containing backticks via the API

I am unsure if this is specific to the Java SDK, but I am having issues checking effective permissions on the following schema: databricks_dev.test_schema` In Scala I have the following example test: test("attempting to access schema with backtick") ...

Latest Reply
seapen
New Contributor II
  • 0 kudos

Update: Interestingly, if I URL-encode _twice_ it appears to work, e.g.: test("attempting to access schema with backtick") { val client = new WorkspaceClient() client.config().setHost("redacted").setToken("redacted") val name = "databricks...
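To show what encoding the schema name once versus twice produces, here is a small Python sketch; it is purely illustrative of the double-encoding workaround and does not touch the SDK itself:

```python
from urllib.parse import quote

name = "databricks_dev.test_schema`"
once = quote(name, safe="")
twice = quote(once, safe="")

print(once)   # databricks_dev.test_schema%60
print(twice)  # databricks_dev.test_schema%2560
```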

lezwon
by Contributor
  • 832 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks Serverless: Package import fails from notebook in subfolder after wheel installation

I have a Python package installed via wheel file in a Databricks serverless environment. The package imports work fine when my notebook is in the root directory, but fail when the notebook is in a subfolder. How can I fix this? src/ ├── datalake_util...

Latest Reply
lezwon
Contributor
  • 1 kudos

It appears that there is a pre-installed package called datalake_utils available within Databricks. I had to rename my package to something else, and it worked like a charm.
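A quick way to check whether a package name collides with something already present in the environment is to look it up before importing. A small sketch (the second name is hypothetical):

```python
import importlib.util

for candidate in ("datalake_utils", "my_datalake_utils"):
    spec = importlib.util.find_spec(candidate)
    print(candidate, "->", spec.origin if spec else "not installed")
```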

1 More Replies
AxelBrsn
by New Contributor III
  • 4942 Views
  • 5 replies
  • 1 kudos

Why are materialized views created in __databricks_internal?

Hello, I have a question about why materialized views are created in the "__databricks_internal" catalog. We specified the catalog and schemas in the DLT pipeline.

Data Engineering
catalog
Delta Live Table
materialized views
Latest Reply
Yogesh_Verma_
Contributor II
  • 1 kudos

Hello, Materialized views created by Delta Live Tables (DLT) are stored in the __databricks_internal catalog for a few key reasons: Separation: This keeps system-generated tables (like materialized views) separate from your own tables and views, so you...
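In practice you rarely need to touch __databricks_internal directly: you read the materialized view through the catalog and schema the pipeline publishes to. A minimal sketch with hypothetical names:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Query the materialized view by its published name, not via __databricks_internal.
mv = spark.read.table("my_catalog.my_schema.daily_sales_mv")  # hypothetical pipeline target
mv.show(5)
```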

4 More Replies
fostermink
by New Contributor II
  • 2203 Views
  • 6 replies
  • 0 kudos

Spark aws s3 folder partition pruning doesn't work

Hi, I have a use case where my Spark job runs on AWS EMR and reads from an S3 path: some-bucket/some-path/region=na/days=1. During my read, I pass DataFrame df = sparkSession.read().option("mergeSchema", true).parquet("some-bucket/some-path...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

In your case, Spark isn't automatically pruning partitions because: Missing partition discovery: for Spark to perform partition pruning when reading directly from paths (without a metastore table), you need to explicitly tell it about the partition st...
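One common fix is to point Spark at the dataset root with the basePath option and read the whole partitioned layout, then filter on the partition columns so pruning can kick in. A sketch using the paths from the post (verify the behavior on your Spark version):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

df = (
    spark.read
    .option("basePath", "s3://some-bucket/some-path/")  # dataset root, so region/days become partition columns
    .option("mergeSchema", "true")
    .parquet("s3://some-bucket/some-path/")
)

# Filters on partition columns are applied as partition pruning.
pruned = df.filter((F.col("region") == "na") & (F.col("days") == 1))
pruned.explain()  # look for PartitionFilters in the plan
```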

5 More Replies
loinguyen3182
by New Contributor II
  • 2971 Views
  • 2 replies
  • 0 kudos

Spark Streaming Error Listing in GCS

I have run into a problem with error listing of _delta_log when Spark reads a stream in Delta format from GCS. This is the full log of the issue: org.apache.spark.sql.streaming.StreamingQueryException: Failed to get result: java.io.IOException: Error ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The key contributing factors to this issue, according to internal investigations and customer tickets, include: Large Number of Log Files in _delta_log: Delta Lake maintains a JSON transaction log that grows with every commit. The more files present...
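If the _delta_log has grown very large, one mitigation worth investigating is tightening the table's checkpoint interval and log retention so fewer JSON files need to be listed. A hedged sketch using standard Delta table properties (the GCS path and values are hypothetical; confirm the settings are appropriate for your table first):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Checkpoint more frequently and retain less log history (hypothetical path and values).
spark.sql("""
    ALTER TABLE delta.`gs://my-bucket/my-table` SET TBLPROPERTIES (
        'delta.checkpointInterval' = '10',
        'delta.logRetentionDuration' = 'interval 7 days'
    )
""")
```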

1 More Replies
sunnyj
by New Contributor III
  • 888 Views
  • 1 reply
  • 0 kudos

Delta Live Tables pipeline

I am very confused about the answer. Can anyone help me with this?

(attached screenshot: sunnyj_0-1750764423110.png)
Data Engineering
axal_r
axel_r
Latest Reply
ilir_nuredini
Honored Contributor
  • 0 kudos

Hello sunnyj, The correct answer is B) At least one notebook library to be executed. This is because a Delta Live Tables pipeline requires at least one notebook library assigned to it that contains a table definition using @dlt.table (or the SQL sy...
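For reference, a minimal sketch of the kind of notebook content a DLT pipeline expects (names are illustrative; the spark object is provided by the pipeline runtime):

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Minimal table definition so the pipeline has something to run")
def example_table():
    return spark.range(10).withColumn("doubled", F.col("id") * 2)
```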

ankitmit
by New Contributor III
  • 4958 Views
  • 7 replies
  • 3 kudos

DLT Apply Changes

Hi, In DLT, how do we specify which columns we don't want to overwrite when using the "apply changes" operation (in the attached example, we want to avoid overwriting the "created_time" column)? I am using this sample code: dlt.apply_changes(...

Latest Reply
brunoillipronti
New Contributor II
  • 3 kudos

Same here. It's kind of ridiculous that apply_changes doesn't support a parameter to update only certain columns... how come that hasn't been a priority since this was released?
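For context, here is a hedged sketch of the general apply_changes call shape. Note that except_column_list excludes columns from the target table entirely; it does not preserve previously written values, so it is not a substitute for the per-column behavior asked about here. All names below are hypothetical:

```python
import dlt

dlt.create_streaming_table("orders_silver")

dlt.apply_changes(
    target="orders_silver",
    source="orders_cdc",        # hypothetical CDC source view
    keys=["order_id"],
    sequence_by="event_ts",     # hypothetical ordering column
    stored_as_scd_type=1,
    # except_column_list=["created_time"],  # drops the column from the target;
    #                                       # it does not keep existing values
)
```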

6 More Replies
ceediii
by New Contributor II
  • 1193 Views
  • 3 replies
  • 1 kudos

Resolved! Declarative Pipeline Asset Bundle Root Folder

Hi everyone, In the new declarative pipeline UI (preview), we have the option to define a root folder. My resource asset bundle is currently defined as: resources: jobs: my_job: name: "(${var.branch}) my_job" tasks: - task_key...

(attached screenshot: ceediii_0-1750694376542.png)
Latest Reply
ilir_nuredini
Honored Contributor
  • 1 kudos

You are welcome, glad it helped! Best, Ilir

2 More Replies
QLA_SethParker
by New Contributor III
  • 1727 Views
  • 2 replies
  • 0 kudos

Resolved! Error Creating Table

We are a current Databricks customer (Azure Databricks) experiencing an issue when creating a table. We have an existing Metastore in the Central region.  All other Workspaces in this Metastore/Region are behind Private Endpoints.  We are trying to c...

(attached screenshot: SethParker02_0-1748985494231.png)
Latest Reply
QLA_SethParker
New Contributor III
  • 0 kudos

Hi Lou, Thank you so much for your detailed reply, and I apologize for leaving this open for so long. I got wrapped up in another project and am just getting back to this. I was able to resolve it, at least in my situation, last night, so I wanted to ...

1 More Replies
jeremy98
by Honored Contributor
  • 2067 Views
  • 2 replies
  • 0 kudos

How to Optimize Batch Inference for Per-Item ML Models in Databricks

Hi everyone, I’m relatively new to Databricks. I worked with it a few months ago, and today I encountered an issue in our system. Basically, we have multiple ML models — one for each item — and we want to run inference in a more efficient way, ideall...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Databricks offers unified capabilities for both real-time and batch inference across traditional ML models and large language models (LLMs) using Mosaic AI Model Serving and AI Functions (notably the ai_query function). For your use case (n items, n ...
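One common batch pattern for one-model-per-item inference is to group the scoring data by item and load that item's MLflow model inside applyInPandas. A hedged sketch where the table name, column names, and model-naming scheme are all assumptions:

```python
import pandas as pd
import mlflow.pyfunc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
features_df = spark.read.table("my_catalog.my_schema.item_features")  # hypothetical table

def score_group(pdf: pd.DataFrame) -> pd.DataFrame:
    item_id = pdf["item_id"].iloc[0]
    model = mlflow.pyfunc.load_model(f"models:/item_model_{item_id}/latest")  # assumed naming scheme
    preds = model.predict(pdf.drop(columns=["item_id"]))
    return pd.DataFrame({"item_id": pdf["item_id"], "prediction": preds})

scored = features_df.groupBy("item_id").applyInPandas(
    score_group, schema="item_id string, prediction double"
)
scored.write.mode("overwrite").saveAsTable("my_catalog.my_schema.item_predictions")
```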

1 More Replies
