Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

carlos_tasayco
by Contributor
  • 2369 Views
  • 1 reply
  • 0 kudos

DATABRICKS CLI SYNC SPECIFIC FILES

Hello, I am struggling with this problem: I need to update a Databricks repo and sync only some files, which according to the documentation is possible: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/cli/sync-commands#only-sync-specific-files In my work...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You are trying to use the --include-from option with your .gitignore file to only sync specific files with the databricks sync command, but you are observing that all files get synced, not just the expected ones. The key issue is how the include/excl...
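For anyone hitting the same behaviour, a minimal sketch of the pattern-file approach, driven from Python purely for illustration; the --include-from flag is the one discussed above, while the file names, paths, and patterns are assumptions. The usual trick is to exclude everything in .gitignore and then re-include only the files you want:

import subprocess

# Illustration only: a .gitignore that excludes everything, so that only
# re-included patterns get synced (assumption based on the linked docs).
with open(".gitignore", "w") as f:
    f.write("*\n")

# Hypothetical pattern file: gitignore-style rules, one per line, naming the
# files that should be synced despite the blanket exclusion.
with open(".sync-include", "w") as f:
    f.write("src/**/*.py\n")

# Sync the current directory to a placeholder workspace path.
subprocess.run(
    ["databricks", "sync",
     "--include-from", ".sync-include",
     ".", "/Workspace/Repos/me/my-repo"],
    check=True,
)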

TomDeas
by New Contributor
  • 1925 Views
  • 1 reply
  • 0 kudos

Resource Throttling; Large Merge Operation - Recent Engine Change?

Morning all, hope you can help as I've been stumped for weeks. Question: have there been recent changes to the Databricks query engine, or Photon (etc.), which may impact large sort operations? I have a Jobs pipeline that runs a series of notebooks which...

[Attachments: run history and query plan screenshots]
Labels: Data Engineering, MERGE, Performance Optimisation, Photon, Query Plan, serverless
Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

There have indeed been recent changes to the Databricks query engine and Photon, especially during the June 2025 platform releases, which may influence how large sort operations and resource allocation are handled in SQL pipelines similar to yours. S...

GJ2
by New Contributor II
  • 10325 Views
  • 9 replies
  • 1 kudos

Install the ODBC Driver 17 for SQL Server

Hi, I am not a data engineer; I want to connect to SSAS. It looks like it can be connected to through pyodbc; however, it looks like I need to install "ODBC Driver 17 for SQL Server" using the following command. How do I install the driver on the cluster an...

[Attachment: screenshot of the install command]
Latest Reply
kathrynshai
  • 1 kudos

Hello Databricks Community, you're correct: connecting to SSAS via pyodbc usually requires the "ODBC Driver 17 for SQL Server". To install it on your cluster, the steps depend on your OS: for Linux, you typically add the Microsoft repository and instal...
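To make the last step concrete, a minimal connectivity check, assuming "ODBC Driver 17 for SQL Server" is already installed on every node (for example via a cluster-scoped init script); the server details are placeholders:

import pyodbc

# Connection string for the freshly installed driver; host, database, and
# credentials below are placeholders.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.example.com,1433;"
    "DATABASE=mydb;UID=myuser;PWD=mypassword;"
    "Encrypt=yes;TrustServerCertificate=no;"
)
cur = conn.cursor()
cur.execute("SELECT @@VERSION")   # trivial round-trip to prove connectivity
print(cur.fetchone()[0])
conn.close()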

8 More Replies
hgm251
by New Contributor
  • 93 Views
  • 2 replies
  • 2 kudos

badrequest: cannot create online table is being deprecated. creating new online table is not allowed

Hello! This seems very sudden: can we really not create online tables anymore? Is there a workaround that would let us create online tables temporarily, as we need more time to move to synced tables? #online_tables

Latest Reply
nayan_wylde
Esteemed Contributor
  • 2 kudos

Yes, the Databricks online tables (legacy) are being deprecated, and after January 15, 2026, you will no longer be able to access or create them: https://docs.databricks.com/aws/en/machine-learning/feature-store/migrate-from-online-tables Here are a few ...

1 More Reply
aav331
by New Contributor
  • 97 Views
  • 2 replies
  • 1 kudos

Resolved! Unable to install libraries from requirements.txt in a Serverless Job and spark_python_task

I am running into the following error while trying to deploy a serverless job running a spark_python_task with Git as the source for the code. The job was deployed as part of a DAB from a GitHub Actions runner. Run failed with error message: Library i...

Latest Reply
aav331
New Contributor
  • 1 kudos

Thank you @Louis_Frolio! I used Pattern C and it resolved the issue for me.

1 More Reply
saicharandeepb
by New Contributor III
  • 61 Views
  • 4 replies
  • 0 kudos

Looking for Suggestions: Designed a Decision Tree to Recommend Optimal VM Types for Workloads

Hi everyone! I recently designed a decision tree model to help recommend the most suitable VM types for different kinds of workloads in Databricks. Thought process behind the design: determining the optimal virtual machine (VM) for a workload is heavil...

Latest Reply
jameswood32
New Contributor III
  • 0 kudos

Your decision tree idea sounds solid! To improve it, consider including additional factors like network bandwidth, storage IOPS, and workload burst patterns. Also, think about cost-performance trade-offs and potential scaling requirements. Validating...
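For readers curious what this looks like in code, a toy sketch of the technique (not the poster's actual model; the workload features and VM labels are invented for illustration):

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented telemetry: per-workload averages and the VM family that suited them.
data = pd.DataFrame({
    "cpu_util_avg": [0.9, 0.2, 0.5, 0.8, 0.3, 0.6],
    "mem_gb_peak":  [16, 256, 64, 32, 192, 48],
    "shuffle_gb":   [5, 120, 40, 10, 90, 30],
    "vm_family":    ["compute", "memory", "general",
                     "compute", "memory", "general"],
})

X, y = data.drop(columns="vm_family"), data["vm_family"]
model = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)

# Inspect the learned rules; factors like IOPS and burst patterns, as
# suggested above, would simply become extra feature columns.
print(export_text(model, feature_names=list(X.columns)))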

3 More Replies
Vetrivel
by Contributor
  • 3437 Views
  • 1 reply
  • 0 kudos

Federate AWS CloudWatch logs to Databricks Unity Catalog

I am looking to integrate CloudWatch logs with Databricks. Our objective is not to monitor Databricks via CloudWatch, but rather to facilitate access to CloudWatch logs from within Databricks. If anyone has implemented a similar solution, kindly prov...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

To access CloudWatch logs from within Databricks, you can set up an integration that enables Databricks to fetch, query, and analyze AWS CloudWatch log data directly—without configuring CloudWatch to monitor Databricks clusters. This approach is incr...
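A minimal sketch of that fetch-and-analyze direction, assuming the cluster can reach CloudWatch Logs via an instance profile or explicit credentials; the region and log group are placeholders and pagination is omitted:

import boto3
from pyspark.sql import Row

logs = boto3.client("logs", region_name="us-east-1")
resp = logs.filter_log_events(
    logGroupName="/my/app/log-group",   # placeholder log group
    limit=1000,                         # nextToken paging omitted for brevity
)

# Flatten the events into rows; `spark` is the session a Databricks
# notebook provides.
rows = [
    Row(ts=e["timestamp"], stream=e["logStreamName"], message=e["message"])
    for e in resp["events"]
]
df = spark.createDataFrame(rows)
df.createOrReplaceTempView("cloudwatch_logs")   # now queryable with SQL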

jeremy98
by Honored Contributor
  • 3388 Views
  • 1 reply
  • 1 kudos

Environment setup in a serverless notebook task

Hi community, is there a way to install dependencies inside a notebook task using serverless compute with Databricks Asset Bundles? Is there a way to avoid installing, every time and for each serverless task that composes a job, the dependencies (or the librar...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

For Databricks serverless compute jobs using Asset Bundles, custom dependencies (such as Python packages or wheel files) cannot be pre-installed on shared serverless infrastructure across job tasks as you can with traditional job clusters. Instead, d...
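One common shape for this, sketched with the Python SDK rather than bundle YAML (the class and field names are assumptions based on recent databricks-sdk releases): declare the dependencies once in a job-level environment and point each serverless task at it through environment_key:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute, jobs

w = WorkspaceClient()

w.jobs.create(
    name="serverless-job-shared-env",
    # The environment is defined once at the job level...
    environments=[
        jobs.JobEnvironment(
            environment_key="default",
            spec=compute.Environment(
                client="1",
                dependencies=["pandas==2.2.2", "requests>=2.31"],  # pinned deps
            ),
        )
    ],
    # ...and each serverless task references it instead of re-installing.
    tasks=[
        jobs.Task(
            task_key="main",
            environment_key="default",
            spark_python_task=jobs.SparkPythonTask(
                python_file="/Workspace/Users/me/script.py"  # placeholder
            ),
        )
    ],
)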

Maser_AZ
by New Contributor II
  • 3805 Views
  • 1 reply
  • 0 kudos

16.2 (includes Apache Spark 3.5.2, Scala 2.12) cluster in Community Edition taking a long time to start

A 16.2 (includes Apache Spark 3.5.2, Scala 2.12) cluster in Community Edition is taking a long time to start. I'm trying to launch DBR 16.2, but the cluster, which is a single node, seems to be taking a long time. Is this a bug in Community Edition? Here is the u...

Labels: Data Engineering, Databricks
Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The long startup time for a Databricks Runtime 16.2 (Apache Spark 3.5.2, Scala 2.12) single-node cluster in Databricks Community Edition is a known issue and not unique to your setup. Many users have reported this situation, with some clusters taking...

Abishrp
by Contributor
  • 2986 Views
  • 1 reply
  • 0 kudos

Product code of Databricks in AWS CUR report

I need to know the productCode of Databricks in the AWS CUR report. Is the productCode the same for all users?

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

In the AWS Cost and Usage Report (CUR), the productCode for Databricks is used to identify costs attributed to Databricks usage within your AWS environment. The value that appears in the lineItem/ProductCode column for Databricks is typically "Databr...
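Once the CUR lands somewhere readable, filtering on that column is a one-liner; a sketch assuming a Parquet CUR delivery with the standard flattened column names, run in a notebook where spark is predefined (the bucket path is a placeholder):

from pyspark.sql import functions as F

cur = spark.read.parquet("s3://my-cur-bucket/cur/")   # placeholder path

(cur.filter(F.col("line_item_product_code") == "Databricks")
    .groupBy("line_item_usage_account_id")
    .agg(F.sum("line_item_unblended_cost").alias("databricks_cost"))
    .show())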

Nick_Pacey
by New Contributor III
  • 3103 Views
  • 1 reply
  • 0 kudos

Foreign Catalog error connecting to SQL Server 2008 R2

Hi, is there a limitation or known issue when creating a foreign catalog to a SQL Server 2008 R2? We are able to connect to this SQL Server successfully through a JDBC connection string. To make this work, we have to switch the Java encrypt flag to fal...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

There are known limitations and issues when connecting to SQL Server 2008 R2, particularly around encryption and JDBC settings, which can manifest as errors in federated catalog operations—even though a direct JDBC connection might succeed if the "en...
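For comparison, the direct JDBC read that does work against such a legacy server typically looks like the following; encrypt=false is the flag mentioned above, and the host, credentials, and table are placeholders:

# Plain JDBC read with encryption relaxed for the 2008 R2 TLS stack.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://legacy-host:1433;databaseName=mydb;encrypt=false")
    .option("dbtable", "dbo.my_table")
    .option("user", "myuser")
    .option("password", "mypassword")
    .load()
)
df.show(5)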

Kabil
by New Contributor
  • 3190 Views
  • 1 reply
  • 0 kudos

Using DLT metadata as a runtime parameter

I have started using DLT pipelines, and I have common code which is used by multiple DLT pipelines. Now I need to read metadata information, like the name of the pipeline and its start time, at runtime; but since I'm using common code and pip...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

To dynamically access metadata like the pipeline name and start time at runtime in your common code for Delta Live Tables (DLT) pipelines, you should leverage runtime context and built-in metadata features provided by the DLT or related orchestrators...
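One concrete workaround along these lines is to pass per-pipeline values through the pipeline's configuration map and read them with spark.conf.get in the shared code. A sketch, where the configuration key and table names are assumptions:

import dlt
from pyspark.sql import functions as F

# Set `mypipeline.name` in each pipeline's configuration (UI or DAB);
# the shared code then reads whichever value the current pipeline carries.
pipeline_name = spark.conf.get("mypipeline.name", "unknown")

@dlt.table(name="audited_source")
def audited_source():
    return (
        spark.read.table("catalog.schema.source_table")   # placeholder source
        .withColumn("pipeline_name", F.lit(pipeline_name))
        .withColumn("ingest_ts", F.current_timestamp())
    )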

TamD
by Contributor
  • 3157 Views
  • 1 reply
  • 0 kudos

ModuleNotFoundError importing function modules into DLT pipelines

Following best practice, we want to avoid duplicating code by putting commonly used transformations into function libraries and then importing and calling those functions where required. We also want to follow Databricks' recommendation to use serverless ...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You are correctly following Databricks’ recommendation to store shared code in Python files and import them into your notebooks, especially for Delta Live Tables (DLT) pipelines and serverless environments. However, import path issues are common, par...
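The usual fix is to make the repo root importable before the import statement runs; a minimal sketch with a placeholder path and a hypothetical shared module:

import sys

# Workspace files are visible under /Workspace/...; append the folder that
# contains the shared package so `import` can resolve it (placeholder path).
sys.path.append("/Workspace/Repos/me/my-repo/src")

from common.transformations import clean_columns   # hypothetical helper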

cszczotka
by New Contributor III
  • 3305 Views
  • 1 reply
  • 0 kudos

Delta Sharing (open sharing) issue accessing data on storage

Hi, I have configured Delta Sharing for an external consumer in Azure Databricks. Azure Databricks and the storage account are in a VNet with no public access. The storage account also has account key access and shared key authorization disabled. I'm running delt...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Delta Sharing in Azure Databricks allows sharing datasets across clouds and with external consumers, but when used in a tightly controlled network environment (private endpoints, no public access, restricted storage account authentication), it behave...
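For context, the recipient side of open sharing looks like this: a sketch assuming pip install delta-sharing and a profile file downloaded from the activation link, with placeholder share coordinates. The sharing server hands the client short-lived pre-signed URLs that resolve to the storage endpoint itself, so the consumer also needs network reachability to that endpoint; that is precisely what a locked-down VNet setup can block:

import delta_sharing

profile = "/path/to/config.share"   # recipient credential file (placeholder)

# load_as_pandas fetches the table data via pre-signed storage URLs.
df = delta_sharing.load_as_pandas(f"{profile}#my_share.my_schema.my_table")
print(df.head())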

dc-rnc
by Contributor
  • 3078 Views
  • 2 replies
  • 2 kudos

Issue pulling Docker Image on Databricks Cluster through Azure Container Registry

Hi Community. Essentially, we push our custom Docker image to ACR and would then like to pull it to create a Databricks cluster. However, during cluster creation, we got the following error. I'm convinced we tried to authenticate in al...

[Attachments: error screenshots]
Latest Reply
mark_ott
Databricks Employee
  • 2 kudos

You are experiencing an authentication issue when trying to use a custom Docker image from Azure Container Registry (ACR) with Databricks clusters, despite successfully using admin tokens and service principals with acrpull permissions in other conte...
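For reference, the payload shape the Clusters API expects for Docker basic auth, sketched against the REST endpoint directly (Databricks Container Services must be enabled on the workspace; the host, token, image, node type, and credentials are placeholders):

import requests

resp = requests.post(
    "https://<workspace-host>/api/2.1/clusters/create",
    headers={"Authorization": "Bearer <token>"},
    json={
        "cluster_name": "custom-image-cluster",
        "spark_version": "15.4.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 1,
        "docker_image": {
            "url": "myregistry.azurecr.io/my-image:latest",
            "basic_auth": {
                # For ACR, a service principal app ID and secret with the
                # acrpull role typically go here.
                "username": "<acr-username>",
                "password": "<acr-password>",
            },
        },
    },
)
resp.raise_for_status()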

1 More Reply
