Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

crami
by New Contributor II
  • 15 Views
  • 2 replies
  • 0 kudos

Quota Limit Exhausted Error when Creating declarative pipeline

I am trying to develop a declarative pipeline. As per platform policy, I cannot use serverless; for that reason, I am using an asset bundle to create the declarative pipeline. In the bundle, I am trying to specify compute for the pipeline. However, I am constantly f...

Latest Reply
Khaja_Zaffer
Contributor III
  • 0 kudos

Hello @crami, good day! As the error indicates, you need to increase the VM quota. I know you have enough in place, but spot fallback + Photon + autoscale triggers the failure. Go to Azure Portal → Subscriptions → Usage + quotas → Filter: Provide...

1 More Replies
VikasSinha
by New Contributor
  • 6227 Views
  • 3 replies
  • 0 kudos

Which is better - Azure Databricks or GCP Databricks?

Which cloud hosting environment is best for Databricks? My question comes down to the fact that there must be some difference in latency, throughput, result consistency, and reproducibility between the different cloud hosting environments of ...

Latest Reply
helen2
Visitor
  • 0 kudos

The main Databricks experience is essentially the same on both Azure and GCP. The difference is in the cloud infrastructure that supports them. Azure Databricks is a bit more integrated with Azure services like Azure Data Lake, Synapse Analytics, and ...

2 More Replies
Nidhig
by Contributor
  • 302 Views
  • 2 replies
  • 1 kudos

Resolved! Conversational Agent App integration with genie in Databricks

Hi, I have recently explored the conversational agent app from the marketplace, integrated with a Genie Space. The connection setup went well, but I found a sync issue between the app and the Genie Space. Even after multiple deployments I couldn't see...

Latest Reply
HariSankar
Contributor III
  • 1 kudos

Hi @Nidhig, this isn't expected behavior; it usually happens when the app's service principal lacks permissions to access the SQL warehouse, Genie Space, or underlying Unity Catalog tables. Try these fixes: --> SQL Warehouse: Go to Compute -> SQL Warehou...

1 More Replies
Dhruv-22
by Contributor
  • 27 Views
  • 1 replies
  • 0 kudos

Reading empty json file in serverless gives error

I have a pipeline which puts JSON files in a storage location after reading a daily delta load. Today I encountered a case where the file was empty. I tried running the notebook manually using a serverless cluster (Environment version 4) and encountered...

Latest Reply
K_Anudeep
Databricks Employee
  • 0 kudos

Solution provided here:  https://community.databricks.com/t5/data-engineering/reading-empty-json-file-in-serverless-gives-error/m-p/137022#M50682

dipanjannet
by New Contributor II
  • 2715 Views
  • 3 replies
  • 0 kudos

Anyone using Databricks Query Federation for ETL purpose ?

Hello All, we have a use case to fetch data from a SQL Server wherein we have some tables to consume. This is typically an OLTP setup wherein the data comes in at a regular interval. Now, as we have Unity Catalog enabled, we are interested in exploring Databr...

Latest Reply
dipanjannet
New Contributor II
  • 0 kudos

Hello @nikhilj0421, thank you for responding. The question is not about DLT. The question is: what is the use case of Databricks Query Federation? If we plug in Query Federation, what are the implications? What is Databricks suggesting for that?

2 More Replies
RakeshRakesh_De
by New Contributor III
  • 2291 Views
  • 3 replies
  • 1 kudos

Databricks Free Edition - SQL Server connector not working

I am trying to explore the new Databricks Free Edition, but the SQL Server connector ingestion pipeline cannot be set up through the UI. It's showing an error that serverless compute must be enabled for the workspace, but Free Edition only has the serverless option ...

Data Engineering
FreeEdition
LakeFlow
Latest Reply
Saf4Databricks
New Contributor III
  • 1 kudos

Hi @RakeshRakesh_De, the error is misleading. As mentioned in the second row of the table linked here, the gateway runs on classic compute, and the ingestion pipeline runs on serverless compute (mentioned in the third row of the same table). Hop...

2 More Replies
austinoyoung
by New Contributor III
  • 34 Views
  • 2 replies
  • 1 kudos

Resolved! oracle sequence number

Dear All, I am trying to use a JDBC driver to connect to an Oracle database and append a new record to a table. The table has a column that needs to be populated with a sequence number. I've been trying to use select `<sequence_name>.nextval` to get the sequ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @austinoyoung, short answer: don't try to pull the sequence in your Spark insert. Let Oracle assign it. Why this happens (ORA-02287: sequence number not allowed here): Spark's JDBC writer generates parameterized INSERT statements like: INSERT INT...
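A minimal sketch of the "let Oracle assign it" approach the reply describes. The table, sequence, and column names below are hypothetical, and the Spark/Oracle steps are shown in comments since they only run against a live cluster and database:

```python
# 1) On the Oracle side, attach the sequence to the column once, e.g.:
#      ALTER TABLE orders MODIFY (order_id DEFAULT orders_seq.NEXTVAL);
#    (or use an IDENTITY column, or a BEFORE INSERT trigger on older versions)
#
# 2) On the Spark side, drop the sequence column before the JDBC append,
#    so the generated INSERT never mentions it:
#      df.drop("order_id").write.jdbc(url, "orders", mode="append", properties=props)

def columns_for_insert(table_columns, sequence_column):
    """Return the columns Spark should include in the INSERT,
    excluding the one Oracle will populate from the sequence."""
    return [c for c in table_columns if c.lower() != sequence_column.lower()]

print(columns_for_insert(["ORDER_ID", "CUSTOMER", "AMOUNT"], "order_id"))
# prints ['CUSTOMER', 'AMOUNT']
```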

1 More Replies
bidek56
by New Contributor III
  • 16 Views
  • 1 replies
  • 0 kudos

Resolved! Stack traces as standard error in job logs

When using DBR 16.4, I am seeing a lot of stack traces as standard error in jobs. Any idea why they are showing up and how to turn them off? Thx "FlagSettingCacheMetricsTimer" id=18 state=WAITING - waiting on <0x2d1573c6> (a java.util.TaskQueue) - locke...

  • 16 Views
  • 1 replies
  • 0 kudos
Latest Reply
bidek56
New Contributor III
  • 0 kudos

Setting spark.databricks.driver.disableJvmThreadDump=true will remove the stack traces.
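Since this is a driver-level flag, it belongs in the cluster's Spark config (applied at cluster start), not set from a notebook at runtime. A minimal sketch of a job cluster spec carrying the flag; the DBR version, node type, and worker count are placeholder values:

```python
# Cluster spec fragment (Jobs/Clusters API style) with the flag in spark_conf.
new_cluster = {
    "spark_version": "16.4.x-scala2.12",   # placeholder DBR version
    "node_type_id": "Standard_DS3_v2",     # placeholder node type
    "num_workers": 2,
    "spark_conf": {
        "spark.databricks.driver.disableJvmThreadDump": "true",
    },
}
print(new_cluster["spark_conf"])
```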

saab123
by New Contributor II
  • 3223 Views
  • 1 replies
  • 0 kudos

Not able to export maps in dashboards

When we export a dashboard with maps, the map background doesn't show up in the pdf. 

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

When exporting a Databricks dashboard with maps to PDF, it is a known issue that the map background sometimes does not appear in the exported PDF file. This problem has been discussed in the Databricks community as of early 2025, and appears to be a ...

SrihariB
by New Contributor
  • 3661 Views
  • 1 replies
  • 0 kudos

Read from multiple sources in a single stream

Hey all, I am trying to read data from multiple s3 locations using a single stream DLT pipeline and loading data into a single target. Here is the scenario. S3 Locations: Below are my s3 raw locations with change in the directory names at the end. Ba...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You are using Databricks Autoloader (cloudFiles) within a Delta Live Tables (DLT) pipeline to ingest streaming Parquet data from multiple S3 directories with a wildcard pattern, and you want to ensure all matching directories’ data is included in a s...

Mauro
by New Contributor II
  • 3217 Views
  • 1 replies
  • 0 kudos

DLT change in hive metastore destination to unity catalog

A change recently came out in which Databricks requires using Unity Catalog as the output of a DLT pipeline, where previously it was the Hive Metastore. At first I was working using CDC plus expectations, which resulted in the "allow_expectations_c...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Databricks has recently enforced Unity Catalog as the output target for Delta Live Tables (DLT), replacing the legacy Hive Metastore approach. As a result, the familiar "allow_expectations_col" column, which was automatically added to help track and ...

Yuppp
by New Contributor
  • 3740 Views
  • 1 replies
  • 0 kudos

Need help with setting up ForEach task in Databricks

Hi everyone, I have a workflow involving two notebooks: Notebook A and Notebook B. At the end of Notebook A, we generate a variable number of files; let's call it N. I want to run Notebook B for each of these N files. I know Databricks has a Foreach ta...

Data Engineering
ForEach
Workflows
Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

You can use Databricks Workflows' foreach task to handle running Notebook B for each file generated in Notebook A. The key is to pass each path as a parameter to Notebook B using Databricks task values and workflows features, not widgets set manually...
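A minimal sketch of the wiring the reply describes: Notebook A publishes the file list as a task value, and a for-each task fans Notebook B out over it. The notebook paths and task keys below are hypothetical; the task list mirrors the Jobs API payload shape:

```python
# In Notebook A (runs on Databricks, shown for reference only):
#   dbutils.jobs.taskValues.set(key="file_paths", value=paths)

# Jobs API-style task definitions for the workflow.
tasks = [
    {"task_key": "notebook_a",
     "notebook_task": {"notebook_path": "/Workspace/NotebookA"}},
    {"task_key": "process_each_file",
     "depends_on": [{"task_key": "notebook_a"}],
     "for_each_task": {
         # Reads the task value Notebook A published.
         "inputs": "{{tasks.notebook_a.values.file_paths}}",
         "task": {
             "task_key": "process_each_file_iteration",
             "notebook_task": {
                 "notebook_path": "/Workspace/NotebookB",
                 # {{input}} resolves to one element per iteration.
                 "base_parameters": {"file_path": "{{input}}"},
             },
         },
     }},
]
print(tasks[1]["for_each_task"]["inputs"])
```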

ironv
by New Contributor
  • 3668 Views
  • 1 replies
  • 0 kudos

using concurrent.futures for parallelization

Hi, I am trying to copy a table with billions of rows from an enterprise data source into my Databricks table. To do this, I need to use a homegrown library which handles auth etc., runs the query, and returns a dataframe. I am partitioning the table using...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The "SparkSession$ does not exist in the JVM" error in your scenario is almost always due to the use of multiprocessing (like ProcessPoolExecutor) with Spark. Spark contexts and sessions cannot safely be shared across processes, especially in Databri...
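The fix the reply points at, sketched with threads instead of processes so every worker shares the single driver-side SparkSession. `fetch_partition` is a hypothetical stand-in for the homegrown library call that returns one partition's rows:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_partition(partition_id):
    # Placeholder for something like:
    #   homegrown_lib.run_query(f"... WHERE part_key = {partition_id}")
    return [(partition_id, row) for row in range(3)]

def fetch_all(partition_ids, max_workers=4):
    # Threads share the process (and thus the SparkSession);
    # ProcessPoolExecutor would fork and lose the JVM gateway.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(fetch_partition, partition_ids)  # preserves order
    return [row for part in results for row in part]

rows = fetch_all([0, 1])
print(len(rows))  # prints 6 (3 rows from each of the 2 partitions)
```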

umahesb3
by New Contributor
  • 3818 Views
  • 1 replies
  • 0 kudos

Facing issues databricks asset bundle, All jobs are getting Deployed into specified targets Instead

Facing issues with Databricks Asset Bundles: all jobs are getting deployed into every target instead of only the defined target. The following are the files I am using (the resources YAML and the databricks.yml file). I am using Databricks CLI v0.240.0, and I am using databricks b...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

The issue you’re facing—where all Databricks Asset Bundle jobs are being deployed to all targets instead of only the defined target(s)—appears to be a known limitation in how the bundle resource inclusion and target mapping works in the Databricks CL...

shubham_007
by Contributor III
  • 3230 Views
  • 1 replies
  • 0 kudos

Dear experts, need urgent help on logic.

Dear experts, I am facing difficulty while developing PySpark automation logic for "developing automation logic to delete/remove display() and cache() method calls used in scripts across multiple Databricks notebooks (tasks)". Kindly advise on developing automati...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

To automate the removal of display() and cache() method calls from multiple PySpark scripts in Databricks notebooks, develop a script that programmatically processes exportable notebook source files (usually in .dbc or .ipynb format) using text-based...
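A text-based sketch of the cleanup the reply describes: drop lines that consist only of a display(...) call and strip .cache() chains from exported notebook source. The patterns are intentionally conservative illustrations; real notebooks may need smarter parsing (e.g. via the ast module):

```python
import re

# Matches lines whose only content is a display(...) call.
DISPLAY_LINE = re.compile(r"^\s*display\(.*\)\s*$")
# Matches chained .cache() calls anywhere on a line.
CACHE_CALL = re.compile(r"\.cache\(\)")

def clean_source(source: str) -> str:
    kept = [ln for ln in source.splitlines() if not DISPLAY_LINE.match(ln)]
    return "\n".join(CACHE_CALL.sub("", ln) for ln in kept)

before = "df = spark.read.table('t').cache()\ndisplay(df)\nprint(df.count())"
print(clean_source(before))
# prints:
# df = spark.read.table('t')
# print(df.count())
```

Applied across notebooks, the same function can be run over each exported `.py`/`.ipynb` source cell before re-importing.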

