Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

prakashhinduja
by New Contributor III
  • 1789 Views
  • 1 reply
  • 1 kudos

Resolved! Prakash Hinduja Switzerland (Swiss) How do I implement data quality checks in Databricks workflows?

Hi everyone, I am Prakash Hinduja from Geneva, Switzerland, working on enhancing our data engineering processes. I'm looking to implement data quality checks within our Databricks workflows. Regards, Prakash Hinduja, Geneva, Switzerland

Latest Reply
Pat
Esteemed Contributor
  • 1 kudos

Hi @prakashhinduja, I have something that might inspire you, as I am currently looking into it: DQX.
https://databrickslabs.github.io/dqx/docs/demos/
https://www.youtube.com/watch?v=e5Qvx_gnxTE
Thanks, Pat.

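The DQX links above cover the full framework; as a minimal illustration of the underlying idea, here is a pure-Python sketch of row-level quality rules that split records into passed and quarantined sets. The function and column names are made up for illustration and are not DQX's API.

```python
# Minimal sketch of row-level data quality rules (illustrative, not DQX's API).

def check_not_null(row, column):
    """Rule passes only when the column is present and not None."""
    return row.get(column) is not None

def check_in_range(row, column, low, high):
    """Rule passes only when the numeric value falls inside [low, high]."""
    value = row.get(column)
    return value is not None and low <= value <= high

def run_checks(rows, checks):
    """Split rows into (passed, quarantined) according to all checks."""
    passed, quarantined = [], []
    for row in rows:
        if all(check(row) for check in checks):
            passed.append(row)
        else:
            quarantined.append(row)
    return passed, quarantined

rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": None},   # fails the not-null rule
    {"id": 3, "amount": 999},    # fails the range rule
]
checks = [
    lambda r: check_not_null(r, "amount"),
    lambda r: check_in_range(r, "amount", 0, 100),
]
good, bad = run_checks(rows, checks)
print(len(good), len(bad))  # → 1 2
```

In a real workflow the quarantined rows would typically be written to a separate table for inspection rather than dropped.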
meanwhilefurthe
by New Contributor III
  • 5315 Views
  • 10 replies
  • 0 kudos

orphan queries in running state

We have a job submitted through the Spark Connect API, running on Serverless Compute. The job got canceled twice and left a total of 14 orphaned queries. They are in a weird state: the running time is not increasing, but they are there showin...

Latest Reply
Khaja_Zaffer
Esteemed Contributor
  • 0 kudos

ALSO, did you make any recent code changes or network changes?

9 More Replies
Dimitry
by Valued Contributor
  • 5222 Views
  • 13 replies
  • 2 kudos

Problem activating File Events for External Location / ADLS V2

Hi all, I've followed the book for creating an external location for Azure Data Lake Storage (ADLS V2) using the account connector. I've granted all required permissions to the connector. I've created a "stock" container on the above-mentioned "devtyremeshare" ...

Latest Reply
Khaja_Zaffer
Esteemed Contributor
  • 2 kudos

Hello @Dimitry, Microsoft Teams meeting link: "No subject | Meeting-Join | Microsoft Teams", 8:30 AM UK time, which is 17:30 Sydney time. (I woke up just now.)

12 More Replies
Anwar_Patel
by New Contributor III
  • 4775 Views
  • 5 replies
  • 0 kudos

Resolved! Not received my certificate after passing Databricks Certified Associate Developer for Apache Spark 3.0 - Python.

I've successfully passed the Databricks Certified Associate Developer for Apache Spark 3.0 - Python exam but still have not received the certificate. E-mail: anwarpatel91@gmail.com

Latest Reply
simha6_reddy
New Contributor II
  • 0 kudos

Even I am facing the same issue. I have successfully passed the Databricks Certified Associate Developer for Apache Spark - Python exam but still have not received the certificate. E-mail: simha6.reddy@gmail.com

4 More Replies
jar
by Contributor
  • 3213 Views
  • 3 replies
  • 2 kudos

Resolved! Databricks serverless SQL cost-effective... when?

Hey. I read at one point that there would be a cheaper, less performant version of the Databricks serverless SQL DW, but I can't find any information on this. Was I dreaming, or will it soon be possible with a more cost-effective option for less-intensive...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @jar, It's possible you're referring to the performance mode for serverless compute. However, at the moment it doesn't apply to DB SQL Warehouse: "The first mode, which is in General Availability, is called performance-optimized mode. This mode is d...

2 More Replies
syamsubrahmanya
by Databricks Partner
  • 1378 Views
  • 1 reply
  • 0 kudos

Can I connect Databricks directly to Salesforce CRM for live data access?

Hi everyone, I'm currently working on integrating Databricks with Salesforce CRM. I want to know if it's possible to connect Databricks directly to Salesforce CRM to access live (real-time or near real-time) data, not just periodic batch exports. Specif...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @syamsubrahmanya, Yes, you can use Lakehouse Federation to query Salesforce data from Databricks:
Introducing Salesforce Connectors for Lakehouse Federation and LakeFlow Connect | Databricks Blog
Set Up a Databricks Data Federation Connection | Data ...

Anusha5
by New Contributor
  • 1929 Views
  • 2 replies
  • 0 kudos

Resolved! Getting started with Databricks

Hi All, I want to learn Databricks, coming from an ETL background. Please help me with a roadmap and the certifications that matter most. Thank you.

Latest Reply
MariuszK
Valued Contributor III
  • 0 kudos

I'd recommend my tutorials:
Introduction to Databricks: A Beginner's Guide
https://medium.com/@mariusz_kujawski/getting-started-with-databricks-a-beginners-guide-8b8db7f6f457
Why I Liked Delta Live Tables in Databricks
https://medium.com/@mariusz_kujaws...

1 More Replies
bcodernet
by New Contributor II
  • 3405 Views
  • 3 replies
  • 1 kudos

Databricks Apps with Pyodbc Microsoft SQL Driver

I'm building an app that interfaces with an Azure SQL Database. I need to use Entra auth with a service principal, which is why I'm using the Microsoft ODBC driver. This works fine on my local machine, but I can't figure out how to get the ODBC driv...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @bcodernet, Each Databricks app can include dependencies for Python, Node.js, or both. You define these dependencies in language-specific files: use a requirements.txt file to specify additional Python packages; use a package.json file to specify No...

2 More Replies
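As the reply notes, Python dependencies for an app go into a requirements.txt. A minimal sketch for this use case might look like the one below; the package choices are assumptions. Note that pip installs only the pyodbc Python bindings, while the Microsoft ODBC driver itself is a system-level library that a requirements.txt cannot provide, which is the crux of the question.

```
# requirements.txt (sketch)
pyodbc            # Python bindings only; the msodbcsql driver is a separate system dependency
azure-identity    # assumed here for Entra service-principal token acquisition
```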
yazz
by Databricks Partner
  • 2115 Views
  • 2 replies
  • 0 kudos

Converting Existing Streaming Job to Delta Live Tables with Historical Backfill

Description: I'm migrating a two-stage streaming job into Delta Live Tables (DLT):
Bronze: read from Pub/Sub → write to Bronze table
Silver: use create_auto_cdc_flow on Bronze → upsert into Silver table
New data works perfectly, but I now need to backfil...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @yazz, I'm wondering if you could use a similar approach to the one in the article below. So, just backfill your bronze table first; then the downstream silver and gold layers will pick up the new data from the bronze layer. In that approach you ...

1 More Replies
pt16
by Databricks Partner
  • 1791 Views
  • 3 replies
  • 0 kudos

Enable automatic identity management in Azure Databricks

We have Databricks account admin access but are not able to see the option in the Databricks admin console to enable automatic identity management. Using the Previews page, I wanted to enable it and followed the steps below: 1. As an account admin, log in to the accou...

Latest Reply
pt16
Databricks Partner
  • 0 kudos

After raising a Databricks ticket, today I am able to see the Automatic Identity Management public preview option.

2 More Replies
seefoods
by Valued Contributor
  • 1720 Views
  • 1 reply
  • 1 kudos

Process a MongoDB table into a Delta table in Databricks

Hello guys, I have a MongoDB table whose size is 67 GB. I use streaming to ingest it, but copying all the data to a Delta table is very slow. Does someone have an answer to this? I use the MongoDB connector v10.5. This is my code: pipeline_mongo_sec = [ { "$u...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

What if you do not update the delta table for each incoming microbatch but, for example, only do this every 15 min/hour/whatever? That way you can keep ingesting in a streaming way, but the actual update of the delta table is more batch-oriented, so ...

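The throttling idea in the reply above can be sketched in plain Python. This is not Spark API, just an illustration of buffering microbatches and flushing to the target table only once a threshold is reached; all names are invented.

```python
# Illustrative sketch: accumulate microbatches and flush only every N rows,
# instead of merging into the Delta table on every single microbatch.

class BatchedWriter:
    def __init__(self, flush_every):
        self.flush_every = flush_every
        self.buffer = []
        self.flushes = 0

    def on_microbatch(self, rows):
        """Called once per incoming microbatch; flushes when the buffer is full."""
        self.buffer.extend(rows)
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        """In a real pipeline this would be one MERGE into the Delta table."""
        if self.buffer:
            self.flushes += 1
            self.buffer.clear()

writer = BatchedWriter(flush_every=100)
# 30 microbatches of 10 rows each → 300 rows, but only 3 expensive writes
for batch in ([{"id": i}] * 10 for i in range(30)):
    writer.on_microbatch(batch)
writer.flush()  # drain any remainder at the end
print(writer.flushes)  # → 3
```

A time-based variant (flush every 15 minutes) would compare a stored timestamp instead of the buffer length; the trade-off in both cases is latency for fewer, larger writes.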
mr3
by New Contributor
  • 3336 Views
  • 2 replies
  • 2 kudos

Update Delta Table with Apache Spark connector

Hi everyone. I'd like to ask a question about updating Delta tables using the Apache Spark connector.Let's say I have two tables: one is a product dimension table with items from my shop, and the other contains a single column with the IDs of the pro...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @mr3, Yes, it's perfectly fine to use a MERGE operation solely for updates. The UPDATE statement is more limited: it supports neither UPDATE FROM nor subqueries. There are situations where we would like t...

1 More Replies
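An update-only MERGE of the kind described in this thread might look like the sketch below. The table and column names are hypothetical, not from the original post; the key point is that omitting any WHEN NOT MATCHED clause makes the MERGE behave as a pure update driven by the second table.

```sql
-- Update-only MERGE: flag products whose IDs appear in the second table.
MERGE INTO product_dim AS t
USING ids_to_update AS s
  ON t.product_id = s.product_id
WHEN MATCHED THEN
  UPDATE SET t.needs_refresh = true;
```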
shrutikatyal
by New Contributor III
  • 5838 Views
  • 9 replies
  • 2 kudos

Resolved! commit time is coming as null in autoloader

As per the new Auto Loader feature in Databricks, we can use the archival and move feature. I am trying to use it on Databricks 16.4.x-scala2.12; however, commit time is still coming back null, even though it's mentioned in the documen...

Latest Reply
TheOC
Databricks Partner
  • 2 kudos

Hey @shrutikatyal, I believe the only current route to get a discount voucher would be the following:
https://community.databricks.com/t5/events/dais-2025-virtual-learning-festival-11-june-02-july-2025/ev-p/119323
I think it's the last day of the event ...

8 More Replies
MinuN
by New Contributor
  • 3962 Views
  • 1 reply
  • 0 kudos

Handling Merged Heading Rows When Converting Excel to CSV in Databricks

Hi all, I'm working on a process in Databricks to convert multiple Excel files to CSV format. These Excel files follow a similar structure but with some variations. Here's the situation: each file contains two header rows; the first row contains merged ...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi MinuN, How are you doing today? That's a great question, and you're definitely on the right path using BeautifulSoup to extract the table structure from .xls HTML-like files. To generate the repeated first row of main headings for the CSV, one pract...

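One practical way to flatten two header rows of this kind, assuming the merged top-row headings export as a value followed by blanks (as merged cells usually do), is sketched below in plain Python. The column names are invented for illustration.

```python
# Flatten a merged top header row plus a sub-header row into one header row.
# Merged cells are assumed to export as the heading once, then empty strings.

def flatten_headers(top_row, sub_row):
    flat, current = [], ""
    for top, sub in zip(top_row, sub_row):
        if top:              # a new merged heading group starts here
            current = top
        flat.append(f"{current}_{sub}" if sub else current)
    return flat

top = ["Sales", "", "", "Costs", ""]
sub = ["Q1", "Q2", "Q3", "Fixed", "Variable"]
print(flatten_headers(top, sub))
# → ['Sales_Q1', 'Sales_Q2', 'Sales_Q3', 'Costs_Fixed', 'Costs_Variable']
```

The resulting single header row can then be written as the first line of the CSV, giving each column an unambiguous name.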
sridharplv
by Valued Contributor II
  • 2067 Views
  • 1 reply
  • 1 kudos

Need help on "You cannot enable Iceberg reads on materialized views and streaming tables"

Hi All, As we "cannot enable Iceberg reads on materialized views and streaming tables", is there any option in private preview to enable Iceberg reads for materialized views and streaming tables? I tried using the option of DLT Sink API with table c...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi sridharplv, How are you doing today? As per my understanding, Databricks does not support Iceberg reads for materialized views and streaming tables, and there's no official preview or timeline shared publicly for enabling this support. Your workaro...
