Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jpassaro
by Visitor
  • 11 Views
  • 0 replies
  • 0 kudos

Does Databricks respect the parallel vacuum setting?

I am trying to run VACUUM on a Delta table that I know has millions of obsolete files. Out of the box, VACUUM runs the deletes in sequence on the driver. That is bad news for me! According to the OSS Delta docs, the setting spark.databricks.delta.vacuum.pa...
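For context, a minimal sketch of the pattern the post appears to be describing, assuming the truncated setting is the OSS Delta option spark.databricks.delta.vacuum.parallelDelete.enabled (the table name is a placeholder):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# OSS Delta setting that spreads VACUUM's file deletes across the cluster
# instead of running them serially on the driver; whether a given Databricks
# runtime honors it is exactly the question being asked here.
spark.conf.set("spark.databricks.delta.vacuum.parallelDelete.enabled", "true")

# Placeholder table name; RETAIN must respect your retention policy.
spark.sql("VACUUM my_catalog.my_schema.my_table RETAIN 168 HOURS")
```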

Sunil_Patidar
by New Contributor
  • 36 Views
  • 1 reply
  • 0 kudos

Unable to read from or write to Snowflake Open Catalog via Databricks

I have Snowflake Iceberg tables whose metadata is stored in Snowflake Open Catalog. I am trying to read these tables from the Open Catalog and write back to the Open Catalog using Databricks. I have explored the available documentation but haven't bee...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Greetings @Sunil_Patidar, Databricks and Snowflake can interoperate cleanly around Iceberg today, but how you do it matters. At a high level, interoperability works because both platforms meet at Apache Iceberg and the Iceberg REST Catalog API. Wh...
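For readers landing here, a generic Apache Spark sketch of the Iceberg REST Catalog wiring the reply refers to; every name, URI, and credential below is a placeholder, and it assumes the Apache Iceberg Spark runtime jar is available on the cluster (on Databricks the equivalent is typically configured through catalog settings rather than hand-rolled like this):

```python
from pyspark.sql import SparkSession

# All values below are placeholders for a Snowflake Open Catalog account.
spark = (
    SparkSession.builder
    # Assumes the Apache Iceberg Spark runtime jar is on the classpath.
    .config("spark.sql.catalog.opencat", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.opencat.type", "rest")
    .config("spark.sql.catalog.opencat.uri",
            "https://<account>.snowflakecomputing.com/polaris/api/catalog")
    .config("spark.sql.catalog.opencat.credential", "<client_id>:<client_secret>")
    .config("spark.sql.catalog.opencat.warehouse", "<open_catalog_name>")
    .getOrCreate()
)

spark.sql("SELECT * FROM opencat.my_schema.my_iceberg_table LIMIT 10").show()
```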

Maxrb
by Visitor
  • 38 Views
  • 3 replies
  • 1 kudos

pkgutil walk_packages stopped working in DBR 17.2

Hi, after moving from Databricks Runtime 17.1 to 17.2, my pkgutil walk_packages suddenly doesn't identify any packages within my repository anymore. This is my example code: import pkgutil import os packages = pkgutil.walk_packages([os.getcwd()]) print...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hmmm, I have not personally experienced this. I dug a little deeper into our internal docs and leveraged some internal tools to put together another approach for you. Please give this a try and let me know. You're running into a subtle but very re...
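The reply is truncated, but one common cause is worth sketching: pkgutil.walk_packages() imports sub-packages as it walks, so the directory being scanned must be importable. A hedged workaround that pins the path onto sys.path explicitly:

```python
import os
import sys
import pkgutil

# walk_packages() imports each sub-package while walking, so the scanned
# directory must be importable; putting it on sys.path removes any dependence
# on how the runtime happens to set up the working directory.
repo_root = os.getcwd()
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

print([m.name for m in pkgutil.walk_packages([repo_root])])

# iter_modules() lists top-level packages without importing them, which is
# less sensitive to runtime environment changes if that is all you need.
print([m.name for m in pkgutil.iter_modules([repo_root])])
```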

2 More Replies
969091
by New Contributor II
  • 37533 Views
  • 11 replies
  • 10 kudos

Send custom emails from a Databricks notebook without using a third-party SMTP server. Would like to utilize Databricks' existing SMTP or the Databricks API.

We want to use the existing Databricks SMTP server, or the Databricks API if it can be used to send custom emails. Databricks Workflows sends email notifications on success, failure, etc. of jobs, but cannot send custom emails. So we want to send custom emails to di...

Latest Reply
Shivaprasad
Contributor
  • 10 kudos

Were you able to get the custom email working from a Databricks notebook? I was trying but was not successful. Let me know.
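As far as the thread goes, there is no documented Databricks SMTP endpoint or custom-email API for user code, so the usual workaround is plain smtplib against an SMTP endpoint you control; a minimal sketch (host, addresses, and the secret scope are placeholders):

```python
import smtplib
from email.message import EmailMessage

# Placeholders: use your own SMTP relay, and pull the password from a
# Databricks secret scope (dbutils is available inside notebooks).
SMTP_HOST = "smtp.example.com"
SMTP_PORT = 587
SMTP_USER = "alerts@example.com"
SMTP_PASS = dbutils.secrets.get(scope="my-scope", key="smtp-password")

msg = EmailMessage()
msg["Subject"] = "Custom notification from a Databricks notebook"
msg["From"] = SMTP_USER
msg["To"] = "recipient@example.com"
msg.set_content("Job finished; see the run output for details.")

with smtplib.SMTP(SMTP_HOST, SMTP_PORT) as server:
    server.starttls()                 # upgrade the connection to TLS
    server.login(SMTP_USER, SMTP_PASS)
    server.send_message(msg)
```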

10 More Replies
alesventus
by Contributor
  • 145 Views
  • 5 replies
  • 1 kudos

Resolved! Power BI refresh job task

I have tried the Databricks job task to refresh a Power BI dataset and I have found 2 issues. 1. I set up tables in Power BI Desktop using Import mode. After deploying the model to Power BI Service, I was able to download it as an Import mode model. However...

Latest Reply
emma_s
Databricks Employee
  • 1 kudos

Can you send a screenshot of the Power BI refresh task in the Jobs UI within Databricks, please?
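As a general note while that is investigated, one hedged fallback is triggering the refresh directly through the Power BI REST API from a notebook, assuming you can obtain an Azure AD token with dataset refresh permissions (all IDs are placeholders):

```python
import requests

# Placeholders: workspace (group) ID, dataset ID, and a pre-acquired Azure AD
# bearer token with Dataset.ReadWrite.All permission.
group_id = "<workspace-id>"
dataset_id = "<dataset-id>"
token = "<aad-access-token>"

resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{group_id}/datasets/{dataset_id}/refreshes",
    headers={"Authorization": f"Bearer {token}"},
    json={"notifyOption": "NoNotification"},
)
resp.raise_for_status()  # HTTP 202 means the refresh request was queued
```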

4 More Replies
timstrath
by Visitor
  • 28 Views
  • 1 reply
  • 1 kudos

Failed to create ingestion gateway due to no 'serverless compute'

Failed to create ingestion gateway: "Pipelines targeting catalogs using Default Storage must use serverless compute. If you don't have access to serverless compute, please contact Databricks to enable this feature for your workspace."

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @timstrath, it seems that your catalog is backed by Default Storage. In that case, this error is pretty explicit: you need to use serverless compute to create a Lakeflow ingestion pipeline if you have a catalog using Default Storage (BTW, I think you...

oye
by New Contributor II
  • 44 Views
  • 2 replies
  • 0 kudos

Unavailable GPU compute

Hello, I would like to create an ML compute with a GPU. I am on GCP europe-west1, and the only available options for me are the G2 family and one instance of the A3 family (a3-highgpu-8g [H100]). I have been trying multiple times at different times but I ...

Latest Reply
SP_6721
Honored Contributor II
  • 0 kudos

Hi @oye, you're hitting a cloud capacity issue, not a Databricks configuration problem. The Databricks GCP GPU docs list A2 and G2 as the supported GPU instance families; A3/H100 is not in the supported list: https://docs.databricks.com/gcp/en/comput...

1 More Replies
seefoods
by Valued Contributor
  • 53 Views
  • 0 replies
  • 0 kudos

Spark conf for serverless jobs

Hello guys, I use serverless on Databricks on Azure, so I have built a decorator which instantiates a SparkSession. My job uses Auto Loader / Kafka with availableNow mode. Does someone know which Spark conf is required, because I want to add it? Thanks. import...
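The post's code is cut off, but a minimal sketch of an Auto Loader availableNow job on serverless may help frame the question; note that on serverless the session comes preconfigured and most cluster-level Spark confs cannot be set (paths and table names are placeholders):

```python
from pyspark.sql import SparkSession

# On serverless, the session is preconfigured; the decorator can usually just
# reuse it rather than setting cluster-level confs, most of which are blocked.
spark = SparkSession.builder.getOrCreate()

# Placeholder paths and table name.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/Volumes/main/raw/_schemas/landing")
    .load("/Volumes/main/raw/landing")
)

(
    df.writeStream
    .option("checkpointLocation", "/Volumes/main/raw/_checkpoints/landing")
    .trigger(availableNow=True)  # drain the current backlog, then stop
    .toTable("main.raw.events")
)
```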

seefoods
by Valued Contributor
  • 197 Views
  • 2 replies
  • 2 kudos

Resolved! Setup Databricks Connect on VSCode and PyCharm

Hello guys, does someone know the best practices to set up Databricks Connect for PyCharm and VSCode using Docker, a Justfile, and a .env file? Cordially, Seefoods

Latest Reply
Gecofer
Contributor II
  • 2 kudos

Hi @seefoods! I've worked with Databricks Connect and VSCode in different projects, and although your question mentions Docker, Justfile, and .env, the "best practices" really depend on what you're trying to do. Here's what has worked best for me: 1. D...
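A minimal sketch of the .env-driven setup the question describes, assuming databricks-connect v13+ where DatabricksSession reads the standard DATABRICKS_* variables (all values are placeholders):

```python
# .env (placeholder values), loaded by the IDE, Docker, or a Justfile recipe:
#   DATABRICKS_HOST=https://adb-1234567890123456.7.azuredatabricks.net
#   DATABRICKS_TOKEN=<personal-access-token>
#   DATABRICKS_CLUSTER_ID=0123-456789-abcdefgh

from databricks.connect import DatabricksSession

# DatabricksSession resolves the standard DATABRICKS_* environment variables
# (or a profile in ~/.databrickscfg), so the same code runs unchanged in
# VSCode, PyCharm, or inside a container.
spark = DatabricksSession.builder.getOrCreate()
print(spark.range(5).collect())
```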

1 More Replies
Joost1024
by New Contributor
  • 110 Views
  • 3 replies
  • 0 kudos

Read Array of Arrays of Objects JSON file using Spark

Hi Databricks Community! This is my first post in this forum, so I hope you can forgive me if it's not according to the forum best practices. After lots of searching, I decided to share the peculiar issue I'm running into with this community. I try to lo...

Latest Reply
Joost1024
New Contributor
  • 0 kudos

I guess I was a bit overenthusiastic in accepting the answer. When I run the following on the single-object array of arrays (as shown in the original post), I get a single row with column "value" and value null. from pyspark.sql import functions as F,...
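That null result is consistent with Spark's JSON source expecting a top-level object or array of objects; a top-level array of arrays needs an explicit schema. A hedged sketch with a hypothetical element schema:

```python
from pyspark.sql import SparkSession, functions as F, types as T

spark = SparkSession.builder.getOrCreate()

# Hypothetical element schema; replace the fields with the real object shape.
element = T.StructType([
    T.StructField("id", T.LongType()),
    T.StructField("name", T.StringType()),
])
schema = T.ArrayType(T.ArrayType(element))

# Read the whole file as one string, then parse with the explicit schema.
raw = spark.read.text("/path/to/file.json", wholetext=True)

parsed = (
    raw.select(F.from_json("value", schema).alias("outer"))
    .select(F.explode("outer").alias("inner"))  # unwrap the outer array
    .select(F.explode("inner").alias("obj"))    # unwrap each inner array
    .select("obj.*")
)
parsed.show()
```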

2 More Replies
rc10000
by New Contributor
  • 96 Views
  • 2 replies
  • 3 kudos

Resolved! Databricks Engineer - DEA Exam vs Training

Hi, I love the Databricks resources but I'm a little confused about which training to take. My focus is studying and practicing for the Databricks Engineer Associate exam, but when I hear of 'the training', I'm not sure which training people are referrin...

Latest Reply
Advika
Community Manager
  • 3 kudos

Hello @rc10000! +1 to what @Louis_Frolio mentioned above. The Learning Plan is designed for users preparing for the Databricks Certified Data Engineer Associate and Professional exams. Also, below are a few paths, depending on what you're looking for: ...

1 More Replies
rc10000
by New Contributor
  • 86 Views
  • 1 reply
  • 1 kudos

Resolved! Lakeflow Connect - Databricks Data Engineer Associate Exam Post-July 2025

Hi, I'm asking another Databricks Data Engineer Associate Exam Dec 2025 question. For those who have taken the DEA exam, is Lakeflow Connect a relevant topic for the test? I've been a little confused about which resource to rely on besides the official study ...

Latest Reply
SP_6721
Honored Contributor II
  • 1 kudos

Hi @rc10000, Lakeflow Connect is mentioned in the exam guide under training, but it's more about the ingestion concepts. These topics come under the Development & Ingestion section. I'd suggest following the official exam guide first and Databricks Ac...

Richard3
by New Contributor II
  • 363 Views
  • 6 replies
  • 5 kudos

IDENTIFIER in SQL Views not supported?

Dear community, we are phasing out the dollar param `${catalog_name}` because it has been deprecated since runtime 15.2. We use this parameter in many queries, and it should now be replaced by the IDENTIFIER clause. In the query below, where we retrieve data...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 5 kudos

I have good news: in runtime 18, IDENTIFIER and parameter markers are supported everywhere! We need to wait a month or two as the SQL warehouse and serverless are still on runtime 17.
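For anyone phasing out ${catalog_name} in ordinary queries today, the IDENTIFIER plus parameter-marker pattern already works outside view definitions (per the reply above, using it inside CREATE VIEW needs runtime 18); a minimal sketch, with placeholder names and a notebook-provided spark session:

```python
# Placeholder names throughout; requires parameter-marker support in
# spark.sql (DBR 14+ / PySpark 3.4+). `spark` is the notebook session.
catalog_name = "my_catalog"

df = spark.sql(
    "SELECT * FROM IDENTIFIER(:cat || '.my_schema.my_table')",
    args={"cat": catalog_name},
)
df.show()
```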

5 More Replies
RobFer1985
by New Contributor
  • 173 Views
  • 2 replies
  • 0 kudos

Databricks pipeline fails expectation when executing Python script, throws error: Update FAILED

Hi Community, I'm new to Databricks and am trying to implement pipeline expectations. The pipelines work without errors and my job works. I've tried multiple ways to implement expectations, in SQL and Python. I keep resolving the errors but end ...

Latest Reply
carlo968rojer
New Contributor
  • 0 kudos

Hello @RobFer1985, the primary cause of your error is a circular reference in your logic: you are defining a table named orders_2 while simultaneously trying to readStream from that same table. In Delta Live Tables (DLT), the function acts as the "wr...
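A minimal sketch of the non-circular pattern the reply describes, with a hypothetical upstream table orders_raw standing in for the self-reference:

```python
import dlt
from pyspark.sql import functions as F

# Hypothetical upstream table "orders_raw": the streaming read targets a
# different table than the one being defined, breaking the circular reference.
@dlt.table(name="orders_2")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
def orders_2():
    return dlt.read_stream("orders_raw").withColumn(
        "ingested_at", F.current_timestamp()
    )
```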

1 More Replies
