Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mgcasas-aws
by New Contributor
  • 1879 Views
  • 1 reply
  • 1 kudos

Resolved! Azure Databricks Serverless private connection to S3 bucket

I'm looking for technical references to connect an Azure Databricks serverless workspace to an S3 bucket over a private site-to-site VPN connection. I found the following to connect AWS (consumer) to Azure (provider), but I'm looking for the other way....

Latest Reply
Sai_Ponugoti
Databricks Employee
  • 1 kudos

Hello @mgcasas-aws Thank you for your question! We’re currently working on a solution for private cross-cloud Delta Sharing (Azure → AWS). In the meantime, here’s a possible approach: Update your Azure Storage Account network settings from private e...

Rainer
by New Contributor
  • 589 Views
  • 2 replies
  • 0 kudos

pyspark.testing.assertSchemaEqual() ignoreColumnOrder parameter exists in 3.5.0 only on Databricks

Hi, I am using the pyspark.testing.assertSchemaEqual() function in my code with the ignoreColumnOrder parameter, which is available since PySpark 4.0.0. https://spark.apache.org/docs/4.0.0/api/python/reference/api/pyspark.testing.assertSchemaEqual.htm...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @Rainer, when you use Databricks Connect, your local code is executed against the Databricks cluster, which uses the Databricks Runtime's PySpark, not your local PySpark installation, meaning your driver node is also running on remote comput...
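
For reference, here is a minimal sketch (not taken from the thread) that only passes ignoreColumnOrder when the runtime's PySpark actually exposes it, and otherwise normalizes field order before comparing; the sample schemas are made up:

```python
# Hedged sketch: use ignoreColumnOrder where available, emulate it otherwise.
import inspect

from pyspark.sql.types import IntegerType, StringType, StructField, StructType
from pyspark.testing import assertSchemaEqual

actual = StructType([StructField("b", StringType()), StructField("a", IntegerType())])
expected = StructType([StructField("a", IntegerType()), StructField("b", StringType())])

if "ignoreColumnOrder" in inspect.signature(assertSchemaEqual).parameters:
    # PySpark 4.0+, or a Databricks Runtime whose PySpark exposes the parameter
    assertSchemaEqual(actual, expected, ignoreColumnOrder=True)
else:
    # Older PySpark: sort the fields by name before comparing
    sort_fields = lambda s: StructType(sorted(s.fields, key=lambda f: f.name))
    assertSchemaEqual(sort_fields(actual), sort_fields(expected))
```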

1 More Replies
karthikmani
by New Contributor
  • 314 Views
  • 1 reply
  • 0 kudos

GitHub Actions to deploy DABs

Hi All, I am trying to create a DABs deployment via GitHub Actions. However, I am constantly getting the error below. Could you suggest what I am doing wrong here? Thanks. Note: we are using OIDC authentication from GitHub. Our company has disabled tok...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @karthikmani The Databricks CLI is not able to find or use the right authentication method/environment variables for OIDC. Can you try this: in your databricks.yaml you have auth_type: github-oidc, and in your workflow you use DATABRICKS_AUTH_TY...
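
As a rough illustration only (not an official check), a small step like the one below can print the DATABRICKS_* variables the CLI will actually see on the runner, which makes a mismatch between the bundle's auth_type and the workflow environment easy to spot; the variable list is an assumption about a typical OIDC setup:

```python
# Illustrative CI sanity check: show which Databricks auth variables are set.
import os

for name in ("DATABRICKS_HOST", "DATABRICKS_AUTH_TYPE", "DATABRICKS_CLIENT_ID"):
    print(f"{name} = {os.environ.get(name, '<not set>')}")

# With PATs disabled, DATABRICKS_TOKEN should not be set at all, since a
# leftover token variable may take precedence over the intended OIDC auth.
if os.environ.get("DATABRICKS_TOKEN"):
    print("Warning: DATABRICKS_TOKEN is set and may override OIDC authentication.")
```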

rohith_23
by New Contributor II
  • 429 Views
  • 3 replies
  • 2 kudos

Resolved! org.apache.hadoop.hive.ql.metadata.HiveException: MetaException

Hi Data Enthusiasts, I have been facing a few errors in SQL warehouse for quite a long time, and it's happening pretty randomly. We checked the query runs and captured the errors below. I believe this is something to do with Hive, and I am facing this when ther...

Latest Reply
NandiniN
Databricks Employee
  • 2 kudos

Hi @rohith_23 , These errors all relate to problems communicating with the Hive Metastore Service (HMS), which is the central component to store metadata (schemas, table locations, column types, etc.) about your tables. The core of the issue in all t...

2 More Replies
GastonClynhens
by New Contributor III
  • 469 Views
  • 1 reply
  • 1 kudos

Resolved! Power BI refresh history info is different from ADF monitor info

In Azure Data Factory, I have a pipeline defined with Bronze - Silver - Gold layers, plus a final step 4 that entails the refresh of a Power BI semantic model. This final step is executed via a Databricks notebook and contains the following tasks: # getting p...

Latest Reply
GastonClynhens
New Contributor III
  • 1 kudos

Considering the logging of the executed notebook: Failed to refresh dataset: {"error":{"code":"ItemNotFound","message":"Dataset \"xxxxxxxxxxxxxxxx\" is not found!Please verify datasetId is correct and user have sufficient permissions."}} The dataset wa...
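
For illustration, a hedged sketch of verifying the dataset id against the Power BI REST API before triggering the refresh; the token, workspace id, and dataset id below are placeholders, not values from the post:

```python
# Sketch only: confirm the dataset is visible to the caller before refreshing it.
import requests

access_token = "<AAD access token for the Power BI API>"   # placeholder
group_id = "<Power BI workspace id>"                       # placeholder
dataset_id = "<dataset id passed in from the pipeline>"    # placeholder

headers = {"Authorization": f"Bearer {access_token}"}
base = f"https://api.powerbi.com/v1.0/myorg/groups/{group_id}"

# Datasets the calling identity can see in this workspace.
visible = requests.get(f"{base}/datasets", headers=headers).json().get("value", [])
if dataset_id not in {d["id"] for d in visible}:
    raise ValueError(f"Dataset {dataset_id} not found in workspace {group_id}; "
                     "check the id and the caller's permissions.")

# Only then trigger the refresh.
requests.post(f"{base}/datasets/{dataset_id}/refreshes", headers=headers).raise_for_status()
```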

paulchen
by New Contributor
  • 1919 Views
  • 1 reply
  • 1 kudos

Resolved! Service principal used in Bitbucket CI/CD pipelines not working

The Databricks Asset Bundle is used for the Bitbucket CI/CD pipelines. The service principal is used in both the local Databricks configuration and the Bitbucket CI/CD environment. The service principal is only working in the local environment for deploym...

Latest Reply
sarahbhord
Databricks Employee
  • 1 kudos

Hey PaulChen, having your Databricks service principal (SP) work locally but fail in Bitbucket CI/CD usually means environment variables aren't set up right, or the pipeline is falling back to an unexpected config. Quick checklist: set all SP creden...
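
A minimal sketch of that checklist, assuming the service principal uses OAuth M2M credentials exposed through the documented DATABRICKS_HOST / DATABRICKS_CLIENT_ID / DATABRICKS_CLIENT_SECRET variables; the "who am I" call is just an illustrative smoke test, not part of the original pipeline:

```python
# Fail fast in CI if the service principal credentials are missing or ignored.
import os

from databricks.sdk import WorkspaceClient

for var in ("DATABRICKS_HOST", "DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET"):
    if not os.environ.get(var):
        raise SystemExit(f"{var} is not set in the CI environment")

w = WorkspaceClient()  # unified auth picks up the variables above
print("Authenticated as:", w.current_user.me().user_name)
```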

cz0
by New Contributor
  • 2353 Views
  • 2 replies
  • 1 kudos

Monitoring structured streaming and Log4J properties

Hi guys, I would like to monitor a streaming job on metrics like delay, processing time, and more. I found this documentation, but I only get messages in the starting and terminating phases, not while I process a record. The job is a pretty easy streaming job which ...

Latest Reply
saurabh18cs
Honored Contributor II
  • 1 kudos

Hi @cz0 The StreamingQueryListener in Spark is designed to give you metrics at the micro-batch level (not per individual record), which is typical for Spark Structured Streaming. onQueryStarted: called when the streaming job starts. onQueryProgress: Ca...
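
For reference, a minimal listener along those lines; the class and the chosen metrics are illustrative, and the printed fields follow the StreamingQueryProgress attributes PySpark exposes:

```python
# Sketch of a micro-batch-level metrics listener (one update per batch, not per record).
from pyspark.sql.streaming import StreamingQueryListener


class SimpleMetricsListener(StreamingQueryListener):
    def onQueryStarted(self, event):
        print(f"Query started: id={event.id}")

    def onQueryProgress(self, event):
        p = event.progress
        print(f"batch={p.batchId} numInputRows={p.numInputRows} "
              f"inputRowsPerSecond={p.inputRowsPerSecond} "
              f"processedRowsPerSecond={p.processedRowsPerSecond}")

    def onQueryIdle(self, event):
        pass  # no new data in this trigger (newer PySpark versions only)

    def onQueryTerminated(self, event):
        print(f"Query terminated: id={event.id}")


spark.streams.addListener(SimpleMetricsListener())  # `spark` is the active SparkSession
```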

1 More Replies
jeff2
by New Contributor
  • 1500 Views
  • 1 reply
  • 1 kudos

Resolved! When embedding redash, is it possible to make it visible without an account?

Same as the title. I created a Redash dashboard in Databricks and want to embed it to show it in another portal. However, other users have accounts in the portal but not in Databricks. In this case, is it possible to show Redash to all these users?

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

It is possible to embed a Redash dashboard created in Databricks into another portal so that users without Databricks accounts can view it, but this requires specific setup and permission management. How Embedding Works To show the dashboard to users...

TinaDouglass
by New Contributor
  • 345 Views
  • 3 replies
  • 1 kudos

Resolved! Summarized Data from Source system into Bronze

Hello, we are just starting with Databricks. Quick question: we have a table in our legacy source system that summarizes values that are used on legacy reports and used for payment in our legacy system. The business wants a dashboard on our new plat...

Latest Reply
Advika
Databricks Employee
  • 1 kudos

Hello @TinaDouglass! Did the suggestions shared above help address your concern? If so, please consider marking one as the accepted solution.

2 More Replies
juanjomendez96
by Contributor
  • 1016 Views
  • 16 replies
  • 4 kudos

Resolved! Control Databricks Platform version

Hello there! I have noticed that my Databricks UI changes from time to time, and I was wondering how I can control the Databricks Platform version so I don't keep getting new changes and new ways/names in my UI. I have found a release page https:...

Latest Reply
georgeb
New Contributor II
  • 4 kudos

Hi, can we get official feedback on when the issue with adding new users/groups in the Databricks Apps UI (not working) will be fixed? I tried with the Python SDK as well and it does not work. The issue was posted previously and applies to my case as well....

15 More Replies
bhargavabasava
by New Contributor III
  • 297 Views
  • 1 reply
  • 0 kudos

Support for JDBC writes from serverless compute

Hi team, are there any plans in place to support JDBC writes using serverless compute?

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Databricks Serverless SQL supports JDBC reads (queries) via the Databricks SQL endpoints, but JDBC writes (inserts/updates) directly from serverless compute are not generally supported. This is a common limitation because serverless SQL endpoints are...
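
For context, the operation being asked about is the standard Spark JDBC write shown below; the connection details and secret scope are placeholders, and on serverless compute this is the path described above as not generally supported (it works on classic compute with an appropriate driver and network path):

```python
# Standard Spark JDBC write, for reference only; all names here are placeholders.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])  # sample data

(
    df.write.format("jdbc")
    .option("url", "jdbc:postgresql://<host>:5432/<database>")        # placeholder URL
    .option("dbtable", "public.target_table")                         # placeholder table
    .option("user", dbutils.secrets.get("jdbc_scope", "db_user"))     # example secret scope
    .option("password", dbutils.secrets.get("jdbc_scope", "db_pass"))
    .mode("append")
    .save()
)
```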

y_sanjay
by New Contributor
  • 2360 Views
  • 2 replies
  • 2 kudos

Temporary view

Hi, I wrote a query to create a temp view in my catalog; the query execution was successful and returned 'OK' in the SQL editor window. However, when I executed the commands 'Show Tables' and 'Select * {temp_view}', it's not identifying the view. W...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @y_sanjay, I guess that is somehow related to how sessions are managed within the SQL Editor. For instance, when I ran the following queries in the SQL Editor all at once, it worked and I got 3 result sets: 1) First result set with OK status, which means tha...
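
A small PySpark illustration of the same session-scoping behavior (the SQL Editor manages sessions differently, but the principle matches); the view name is made up:

```python
# Temp views are per-session: another session cannot see them.
spark.sql("CREATE OR REPLACE TEMP VIEW my_temp_view AS SELECT 1 AS id")

spark.sql("SELECT * FROM my_temp_view").show()            # visible in this session
print(spark.catalog.tableExists("my_temp_view"))          # True

other = spark.newSession()                                # separate temp-view registry
print(other.catalog.tableExists("my_temp_view"))          # False
```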

1 More Replies
Data_NXT
by New Contributor III
  • 487 Views
  • 3 replies
  • 5 kudos

Resolved! Databricks Business dashboards - Interactive cluster Total dollar spent

I'm working on Databricks Business Dashboards and trying to calculate interactive cluster compute time and total dollar spend per workspace. As per standard understanding, the total dollar spent = Interactive Clusters + Job Clusters + SQL Warehouses. I...

Latest Reply
nayan_wylde
Honored Contributor III
  • 5 kudos

Also, the system table will not provide you the exact dollar amount that you spend on interactive compute. Here is the cost breakdown for running interactive compute: Component | Description | Cost Source; DBU Cost | Based on workload type and tier | Databricks; V...
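
As a rough sketch of the DBU portion only, assuming access to the documented system.billing tables (the SKU filter and currency handling should be adapted to your account); the cloud VM, disk, and networking components come from the cloud provider's bill instead:

```python
# Approximate DBU dollars for all-purpose (interactive) compute, per workspace and day.
dbu_cost = spark.sql("""
    SELECT
        u.workspace_id,
        u.usage_date,
        SUM(u.usage_quantity * lp.pricing.default) AS dbu_dollars
    FROM system.billing.usage AS u
    JOIN system.billing.list_prices AS lp
      ON u.sku_name = lp.sku_name
     AND u.cloud = lp.cloud
     AND u.usage_start_time >= lp.price_start_time
     AND (lp.price_end_time IS NULL OR u.usage_start_time < lp.price_end_time)
    WHERE u.sku_name LIKE '%ALL_PURPOSE%'   -- interactive (all-purpose) compute DBUs
    GROUP BY u.workspace_id, u.usage_date
    ORDER BY u.usage_date
""")
dbu_cost.display()
```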

2 More Replies
Sunil_Poluri
by New Contributor
  • 1356 Views
  • 1 reply
  • 1 kudos

Resolved! Unexpected Schema ID Folder Creation in Unity Catalog External Location

I've set up Unity Catalog with an external location pointing to a storage account. For each schema, I've configured a dedicated container path. For example: abfss://schemas@<storage_account>.dfs.core.windows.net/_unityStorage/schemas/<schema_id> When I...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @Sunil_Poluri , I did some research (learned a few things) and here is what I found.  Unity Catalog manages cloud storage mapping for schemas using internal IDs (schema_id) to ensure data isolation, governance, and uniqueness within a metastore—e...

Anubhav2011
by New Contributor II
  • 798 Views
  • 5 replies
  • 4 kudos

What is the Power of DLT Pipeline to read streaming data

I am getting thousands of records every second in my bronze table from Qlik, and every second the bronze table is getting truncated and loaded with new data by Qlik itself. How do I process this much data every second to my silver streaming table before...

Latest Reply
Krishna_S
Databricks Employee
  • 4 kudos

The Apply Changes API is getting deprecated. The AUTO CDC APIs replace the APPLY CHANGES APIs, and have the same syntax. The APPLY CHANGES APIs are still available, but Databricks recommends using the AUTO CDC APIs in their place. Please refer to the...
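
For reference, a hedged sketch of the CDC pattern using the long-standing dlt.apply_changes() signature; per the reply above, the newer AUTO CDC API takes the same arguments, so only the function name changes. The table and column names are illustrative, not from the original post:

```python
# Illustrative DLT CDC flow; source/target/column names are placeholders.
import dlt
from pyspark.sql.functions import col


@dlt.view
def bronze_changes():
    # Incremental read of the change records landed in bronze.
    return spark.readStream.table("bronze.qlik_changes")


dlt.create_streaming_table("silver_customers")

dlt.apply_changes(
    target="silver_customers",
    source="bronze_changes",
    keys=["customer_id"],
    sequence_by=col("change_timestamp"),   # ordering column from the CDC feed
    stored_as_scd_type=1,
)
```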

4 More Replies
