Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

RakeshRakesh_De
by New Contributor III
  • 2285 Views
  • 3 replies
  • 1 kudos

Databricks Free Edition - sql server connector not working-

I am trying to explore the new Databricks Free Edition, but the SQL Server connector ingestion pipeline cannot be set up through the UI. It shows an error that serverless compute must be enabled for the workspace, but Free Edition only has the serverless option ...

Data Engineering
FreeEdition
LakeFlow
Latest Reply
Saf4Databricks
New Contributor III
  • 1 kudos

Hi @RakeshRakesh_De, the error is misleading. As mentioned in the second row of the table here, the gateway runs on classic compute, and the ingestion pipeline runs on serverless compute (mentioned in the third row of the same table linked above). Hop...

2 More Replies
austinoyoung
by New Contributor III
  • 15 Views
  • 2 replies
  • 1 kudos

Resolved! Oracle sequence number

Dear All, I am trying to use the JDBC driver to connect to an Oracle database and append a new record to a table. The table has a column that needs to be populated with a sequence number. I've been trying to use select `<sequence_name>.nextval` to get the sequ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @austinoyoung, short answer: don't try to pull the sequence in your Spark insert; let Oracle assign it. Why this happens (ORA-02287: sequence number not allowed here): Spark's JDBC writer generates parameterized INSERT statements like: INSERT INT...
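
For reference, a minimal PySpark sketch of the "let Oracle assign it" approach, assuming the target table has a trigger or a DEFAULT <sequence>.NEXTVAL on its sequence column; the column, table, URL, and credentials below are placeholders, not from the thread:

    # Drop the sequence-populated column and let Oracle fill it on insert.
    (df.drop("ID")                                                  # placeholder column
       .write
       .format("jdbc")
       .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1")   # placeholder URL
       .option("dbtable", "MY_SCHEMA.MY_TABLE")                     # placeholder table
       .option("driver", "oracle.jdbc.OracleDriver")
       .option("user", "<user>")
       .option("password", "<password>")
       .mode("append")
       .save())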

1 More Replies
bidek56
by New Contributor III
  • 14 Views
  • 1 replies
  • 0 kudos

Resolved! Stack traces as standard error in job logs

When using DBR 16.4, I am seeing a lot of stack traces as standard error in jobs. Any idea why they are showing up and how to turn them off? Thx. "FlagSettingCacheMetricsTimer" id=18 state=WAITING - waiting on <0x2d1573c6> (a java.util.TaskQueue) - locke...

Latest Reply
bidek56
New Contributor III
  • 0 kudos

Setting spark.databricks.driver.disableJvmThreadDump=true will remove the stack traces.
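
For reference, the flag from the reply is typically entered in the cluster's Spark config (Compute → Advanced options → Spark), one setting per line:

    spark.databricks.driver.disableJvmThreadDump true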

crami
by New Contributor II
  • 12 Views
  • 1 replies
  • 0 kudos

Quota Limit Exhausted Error when Creating declarative pipeline

I am trying to develop a declarative pipeline. As per platform policy, I cannot use serverless; for that reason, I am using an asset bundle to create the declarative pipeline. In the bundle, I am trying to specify compute for the pipeline. However, I am constantly f...

Latest Reply
Khaja_Zaffer
Contributor III
  • 0 kudos

Hello @crami, good day! As the error indicates, you need to increase the VM quota. I know you have enough in place, but spot fallback + Photon + autoscale triggers the failure. Go to Azure Portal → Subscriptions → Usage + quotas → Filter: Provide...

saab123
by New Contributor II
  • 3222 Views
  • 1 replies
  • 0 kudos

Not able to export maps in dashboards

When we export a dashboard with maps, the map background doesn't show up in the pdf. 

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

When exporting a Databricks dashboard with maps to PDF, it is a known issue that the map background sometimes does not appear in the exported PDF file. This problem has been discussed in the Databricks community as of early 2025, and appears to be a ...

SrihariB
by New Contributor
  • 3660 Views
  • 1 replies
  • 0 kudos

Read from multiple sources in a single stream

Hey all, I am trying to read data from multiple S3 locations in a single DLT pipeline stream and load the data into a single target. Here is the scenario. S3 locations: below are my S3 raw locations, which differ in the directory names at the end. Ba...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You are using Databricks Autoloader (cloudFiles) within a Delta Live Tables (DLT) pipeline to ingest streaming Parquet data from multiple S3 directories with a wildcard pattern, and you want to ensure all matching directories’ data is included in a s...
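
A minimal sketch of the pattern described above, assuming Auto Loader inside a DLT pipeline with a glob path; the bucket, prefix, and table name are placeholders:

    import dlt

    # One streaming table reads Parquet from every directory matched by the wildcard.
    @dlt.table(name="bronze_events")                    # placeholder table name
    def bronze_events():
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "parquet")
            .load("s3://my-bucket/raw/events_*/")       # placeholder wildcard path
        )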

Mauro
by New Contributor II
  • 3217 Views
  • 1 replies
  • 0 kudos

DLT change in hive metastore destination to unity catalog

A change recently came out in which Databricks now requires using Unity Catalog as the output of a DLT pipeline, where previously it was the Hive metastore. At first I was working with CDC plus expectations, which resulted in the "allow_expectations_c...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Databricks has recently enforced Unity Catalog as the output target for Delta Live Tables (DLT), replacing the legacy Hive Metastore approach. As a result, the familiar "allow_expectations_col" column, which was automatically added to help track and ...

Yuppp
by New Contributor
  • 3740 Views
  • 1 replies
  • 0 kudos

Need help with setting up ForEach task in Databricks

Hi everyone, I have a workflow involving two notebooks: Notebook A and Notebook B. At the end of Notebook A, we generate a variable number of files; let's call it N. I want to run Notebook B for each of these N files. I know Databricks has a Foreach ta...

Data Engineering
ForEach
Workflows
Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You can use Databricks Workflows' foreach task to handle running Notebook B for each file generated in Notebook A. The key is to pass each path as a parameter to Notebook B using Databricks task values and workflows features, not widgets set manually...
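
A minimal sketch of the task-values handoff described above; the output path, task key, and parameter names are placeholders:

    # In Notebook A: publish the list of generated file paths as a task value.
    file_paths = [f.path for f in dbutils.fs.ls("dbfs:/tmp/output/")]   # placeholder location
    dbutils.jobs.taskValues.set(key="file_paths", value=file_paths)

    # In the job definition, the "For each" task takes this list as its input,
    # e.g. {{tasks.notebook_a.values.file_paths}}, and passes each element to
    # Notebook B, where it can be read with dbutils.widgets.get("file_path").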

ironv
by New Contributor
  • 3668 Views
  • 1 replies
  • 0 kudos

using concurrent.futures for parallelization

Hi, I am trying to copy a table with billions of rows from an enterprise data source into my Databricks table. To do this, I need to use a homegrown library which handles auth etc., runs the query, and returns a dataframe. I am partitioning the table using...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The "SparkSession$ does not exist in the JVM" error in your scenario is almost always due to the use of multiprocessing (like ProcessPoolExecutor) with Spark. Spark contexts and sessions cannot safely be shared across processes, especially in Databri...
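
A minimal sketch of the thread-based alternative, since threads share the driver's SparkSession and JVM while separate processes do not; homegrown_lib and partition_ids stand in for the poster's own library and partition list:

    from concurrent.futures import ThreadPoolExecutor

    def fetch_and_write(partition_id):
        # homegrown_lib.run_query is a placeholder for the in-house query helper,
        # assumed here to return a pandas DataFrame.
        pdf = homegrown_lib.run_query(f"SELECT * FROM src WHERE part = {partition_id}")
        spark.createDataFrame(pdf).write.mode("append").saveAsTable("target_table")

    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(fetch_and_write, partition_ids))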

umahesb3
by New Contributor
  • 3817 Views
  • 1 replies
  • 0 kudos

Facing issues with Databricks asset bundles: all jobs are getting deployed into all specified targets instead of the defined target

I am facing issues with a Databricks asset bundle: all jobs are getting deployed into all specified targets instead of the defined target. The following are the files I am using: the resources YAML and the databricks.yml file. I am using Databricks CLI v0.240.0, and I am using databricks b...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The issue you’re facing—where all Databricks Asset Bundle jobs are being deployed to all targets instead of only the defined target(s)—appears to be a known limitation in how the bundle resource inclusion and target mapping works in the Databricks CL...

shubham_007
by Contributor III
  • 3230 Views
  • 1 replies
  • 0 kudos

Dear experts, need urgent help on logic.

Dear experts, I am facing difficulty while developing PySpark automation logic for “developing automation logic to delete/remove the display() and cache() methods used in scripts in multiple Databricks notebooks (tasks)”. Kindly advise on developing automati...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

To automate the removal of display() and cache() method calls from multiple PySpark scripts in Databricks notebooks, develop a script that programmatically processes exportable notebook source files (usually in .dbc or .ipynb format) using text-based...
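
A minimal sketch of the text-based cleanup described above, assuming the notebooks have been exported as .py source files; the directory name and regex patterns are illustrative, not exhaustive:

    import re
    from pathlib import Path

    # Strip display(...) lines and .cache() calls from exported notebook sources.
    DISPLAY_LINE = re.compile(r"^\s*display\(.*\)\s*$\n?", re.MULTILINE)
    CACHE_CALL = re.compile(r"\.cache\(\)")

    for path in Path("exported_notebooks").rglob("*.py"):
        src = path.read_text()
        path.write_text(CACHE_CALL.sub("", DISPLAY_LINE.sub("", src)))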

akuma643
by New Contributor II
  • 3444 Views
  • 2 replies
  • 0 kudos

The authentication value "ActiveDirectoryManagedIdentity" is not valid.

Hi Team, I am trying to connect to SQL Server hosted in an Azure VM using Entra ID authentication from Databricks ("authentication", "ActiveDirectoryManagedIdentity"). Below is the notebook script I am using: driver = "com.microsoft.sqlserver.jdbc.SQLServe...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You are encountering an error because the default SQL Server JDBC driver bundled with Databricks may not fully support the authentication value "ActiveDirectoryManagedIdentity"—this option requires at least version 10.2.0 of the Microsoft SQL Server ...
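
A minimal sketch of the connection once a recent mssql-jdbc driver (10.2.0 or later) is installed on the cluster; the host, database, and table names are placeholders:

    # JDBC read authenticating with the workspace's managed identity (Entra ID).
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:sqlserver://myvm.example.com:1433;databaseName=mydb")  # placeholder
          .option("dbtable", "dbo.my_table")                                          # placeholder
          .option("authentication", "ActiveDirectoryManagedIdentity")
          .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
          .load())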

1 More Replies
ADuma
by New Contributor III
  • 3384 Views
  • 1 replies
  • 0 kudos

Structured Streaming with queue in separate storage account

Hello, we are running a Structured Streaming job which consumes zipped JSON files that arrive in our Azure Prod storage account. We are using Auto Loader and have set up an Event Grid queue which we pass to the streaming job using cloudFiles.queueName. ...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You are attempting to have your Test Databricks streaming job consume files that arrive in your Prod storage, using AutoLoader and EventGrid notifications, without physically copying the data or EventGrid queue to Test. The core challenge is that Eve...
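
A minimal sketch of Auto Loader in file-notification mode pointing at a queue that lives in the other (Prod) storage account; the queue name, secret scope, and paths are placeholders:

    df = (spark.readStream.format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.useNotifications", "true")
          .option("cloudFiles.queueName", "prod-ingest-queue")                # placeholder
          .option("cloudFiles.connectionString",
                  dbutils.secrets.get("my_scope", "prod-queue-connection"))   # placeholder secret
          .load("abfss://raw@prodaccount.dfs.core.windows.net/landing/"))     # placeholder path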

turagittech
by Contributor
  • 3354 Views
  • 1 replies
  • 0 kudos

Identify source of data in query

Hi All, I have an issue. I have several databases with the same schemas that I need to source data from. Those databases are going to end up aggregated in a data warehouse. The problem is that the id column in each means different things. Example: a client id i...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Migrating from Data Factory to Databricks for ETL and warehousing is a solid choice, especially for flexibility and cost-effectiveness in data engineering projects. The core issue—disambiguating “id” fields that are only unique within each source dat...
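
A minimal sketch of one common way to disambiguate the ids: tag each source with a system identifier and derive a surrogate key; the table and column names are placeholders:

    from pyspark.sql.functions import concat_ws, lit

    src_a = spark.read.table("bronze.system_a_clients").withColumn("source_system", lit("system_a"))
    src_b = spark.read.table("bronze.system_b_clients").withColumn("source_system", lit("system_b"))

    # Identical numeric ids from different databases stay distinguishable.
    clients = src_a.unionByName(src_b).withColumn(
        "client_key", concat_ws("-", "source_system", "id")
    )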

jeremy98
by Honored Contributor
  • 3948 Views
  • 2 replies
  • 0 kudos

Best practice on how to set up a medallion architecture pipelines inside DAB

Hi Community, my team and I are working on refactoring our folder repository structure. Currently, I have been placing pipelines related to the Medallion architecture inside a folder named notebook/. However, I believe they should be moved to src/ sin...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Refactoring your folder structure and naming conventions for Medallion architecture pipelines is an essential step to keep code maintainable and intuitive. Based on your context, shifting these pipelines from notebook/ to src/ is a solid move, especi...

1 More Replies
