Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

bianca_unifeye
by Databricks MVP
  • 559 Views
  • 0 replies
  • 1 kudos

Webinar: Geospatial Data Ingestion and Manipulation on Databricks

Geospatial Data Meets Databricks + Felt: Turning Coordinates into Business Insight. Most organisations capture huge volumes of spatial data (addresses, coordinates, routes, catchments) but struggle to operationalise it at scale. Traditional GIS tool...

Data Engineering
geospatial
webinar
Abdul_Alikhan
by New Contributor II
  • 3053 Views
  • 5 replies
  • 3 kudos

Resolved! Serverless compute is not working in Databricks Free Edition

I recently logged into the Databricks Free Edition, but serverless compute is not working. I'm receiving the error: 'An error occurred while trying to attach serverless compute. Please try again or contact support.'

Latest Reply
LonaOsmani
New Contributor III
  • 3 kudos

Hi @Abdul_Alikhan, I experienced the same yesterday when I imported some of my notebooks. I noticed that this error only appeared for imported notebooks because the environment version was 1 by default. Changing the environment version to 2 solved th...

4 More Replies
FarhanM
by New Contributor II
  • 1759 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks Streaming: Recommended Cluster Types and Best Practices

Hi Community, I recently built some streaming pipelines (Autoloader-based) that extract JSON data from the Data Lake and, after parsing and logging, dump it into the Delta Lake bronze layer. Since these are streaming pipelines, they are supposed to r...

Latest Reply
bianca_unifeye
Databricks MVP
  • 1 kudos

When running streaming pipelines, the key is to design for stability and isolation, not to rely on restart jobs. The first thing to do is run your streams on Jobs Compute, not All-Purpose clusters. If available, use Serverless Jobs. Each pipeline shou...

brickster_2018
by Databricks Employee
  • 3042 Views
  • 2 replies
  • 0 kudos

Resolved! I do not have any Spark jobs running, but my cluster is not getting auto-terminated.

The cluster is idle and there are no Spark jobs running in the Spark UI. Still, I see that my cluster is active and not getting terminated.

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

A Databricks cluster is treated as active if any Spark or non-Spark operations are running on it. Even though there are no Spark jobs running on the cluster, it's possible to have some driver-specific application code running marking th...

1 More Replies
fundat
by New Contributor III
  • 733 Views
  • 2 replies
  • 2 kudos

Resolved! Course - Introduction to Apache Spark

Hi, in the course Introduction to Apache Spark, under Apache Spark Runtime Architecture (page 6 of 15), it says: "The cluster manager allocates resources and assigns tasks ... Workers perform tasks assigned by the driver." Can you help me plea...

Latest Reply
BS_THE_ANALYST
Databricks Partner
  • 2 kudos

Hi @fundat, perhaps the picture is useful here. Give this blog a read, I think this will answer some of your questions: https://medium.com/@knoldus/understanding-the-working-of-spark-driver-and-executor-4fec0e669399. All the best, BS

1 More Replies
jigar191089
by New Contributor III
  • 8213 Views
  • 12 replies
  • 0 kudos

Multiple concurrent jobs using interactive cluster

Hi all, I have a notebook in Databricks. This notebook is executed from an Azure Data Factory pipeline that has a Databricks notebook activity with a linked service connected to an interactive cluster. When multiple concurrent runs of this pipeline are created, I...

Data Engineering
azure
Databricks
interactive cluster
Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Greetings @jigar191089, I did some digging and here are some ideas to think about. This smells like a shared-state/import-path issue on an interactive cluster under concurrency. What likely happened: your notebook imports Python modules from /dbf...

11 More Replies
mkwparth
by Databricks Partner
  • 1574 Views
  • 2 replies
  • 1 kudos

Resolved! DLT | Communication lost with driver | Cluster was not reachable for 120 seconds

Hey Community, I'm facing this error: "com.databricks.pipelines.common.errors.deployment.DeploymentException: Communication lost with driver. Cluster 1030-205818-yu28ft9s was not reachable for 120 seconds". This issue occurred in producti...

Latest Reply
nayan_wylde
Esteemed Contributor II
  • 1 kudos

This is actually a known intermittent issue in Databricks, particularly with streaming or Delta Live Tables (DLT) pipelines. This isn't a logical failure in your code; it's an infrastructure-level timeout between the Databricks control plane and the ...

1 More Replies
CaptainJack
by New Contributor III
  • 825 Views
  • 1 reply
  • 0 kudos

Pull workspace URL and workspace name using databricks-sdk / programmatically in a notebook

1. How could I pull the workspace URL (https://adb-XXXXX.XX.....net)? 2. How could I get the workspace name visible in the top right corner? I know that the easiest solution is dbutils.notebook.entry_point.... browserHostName but unfortunately it is not working in job c...

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

Can you give this a shot? Not sure if you have a hard requirement of using the SDK. workspace_url = spark.conf.get('spark.databricks.workspaceUrl') Getting the name is more tricky. You could potentially get it from tags if there is a tagging strategy in place...
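The `spark.conf.get` suggestion in this reply can be wrapped in a small helper. This is only a sketch: the helper name `workspace_url_from_conf` is hypothetical, and it assumes the conf value may come back as a bare host name without a scheme, so it adds `https://` when none is present. It accepts any object with a `.get(key)` method, so it can be tried outside a cluster with a plain dict.

```python
# Conf key taken from the reply above.
WORKSPACE_URL_KEY = "spark.databricks.workspaceUrl"


def workspace_url_from_conf(conf) -> str:
    """Return the workspace URL from a conf-like object.

    `conf` is anything exposing .get(key), e.g. `spark.conf` in a
    notebook or a plain dict in a local test (hypothetical helper).
    """
    host = conf.get(WORKSPACE_URL_KEY)
    if not host:
        raise ValueError(f"{WORKSPACE_URL_KEY} is not set")
    host = host.strip().rstrip("/")
    # Assumption: the value may be a bare host, so add a scheme if missing.
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    return host


# Local stand-in for spark.conf; the host below is a made-up placeholder.
fake_conf = {WORKSPACE_URL_KEY: "adb-123.4.azuredatabricks.net"}
print(workspace_url_from_conf(fake_conf))
```

In a notebook or job, the call would be `workspace_url_from_conf(spark.conf)`, assuming the key is populated in that compute context.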

deano2025
by New Contributor II
  • 3305 Views
  • 1 reply
  • 1 kudos

Databricks Asset Bundles CI/CD design for GitHub Actions

We want to use Databricks Asset Bundles and deploy code changes and tests using GitHub Actions. We have seen lots of content online, but nothing concrete on how this is done at scale. So I'm wondering, if we have many changes and therefore man...

Data Engineering
asset bundles
Latest Reply
AbhaySingh
Databricks Employee
  • 1 kudos

Have you read about the following approach before?
Repository Structure Options
1. Monorepo with Multiple Bundles
repo-root/
├── .github/
│   └── workflows/
│       ├── bundle-ci.yml
│       └── bundle-deploy.yml
├── bundles/
│   ├...

JanFalta
by New Contributor
  • 930 Views
  • 1 reply
  • 0 kudos

Data Masking

Hi all, I need some help with this masking problem. Suppose you create a view that applies a masking function to a table. The user reading this view has to have read access to the underlying table, so theoretically they can access the unmasked data in the table. I would...

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

Are you on Unity Catalog? Databricks has a solution for this through Unity Catalog Column Masking (also called Dynamic Views or Column-Level Security). https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/filters-and-mask...

bhawana-pandey
by Databricks Partner
  • 816 Views
  • 1 reply
  • 0 kudos

Looking for reference DABs bundle yaml and resources for Databricks app deployment (FastAPI redirect

Looking for example databricks.yml and bundle resources for deploying a FastAPI Databricks app using DABs from one environment to another. Deployment works but FastAPI redirects to localhost after deployment, though the homepage loads fine. Need refe...

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

This is a great place to start: https://apps-cookbook.dev/resources/ Happy to answer specifics as they come after you've reviewed that resource. 

kfoster
by Databricks Partner
  • 8167 Views
  • 8 replies
  • 7 kudos

Azure DevOps Repo - Invalid Git Credentials

I have a Repo in Databricks connected to Azure DevOps Repositories. The repo has been working fine for almost a month, until last week. Now when I try to open the Git settings in Databricks, I am getting "Invalid Git Credentials". Nothing has change...

Latest Reply
klaas
New Contributor II
  • 7 kudos

I had a similar problem. I could fix it by following these steps:
1. In the Azure DevOps repository: User Settings -> Personal access tokens -> + New token
2. In Databricks: Settings -> User -> Linked accounts -> Azure DevOps (Personal access token)
You could also...

7 More Replies
whatever
by New Contributor
  • 1747 Views
  • 1 reply
  • 0 kudos

broken file API and inconsistent behavior

Since there is no way to file a bug, I'll post it here. Honestly, I haven't seen such a broken and inconsistent API from a production system yet in my life. What is worse, this same issue is in the 'os' module. And their UI (despite actually showing the f...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @whatever, thanks for sharing this. I will test this and report internally. Meanwhile, you can also submit a new idea/request/bug using this portal from your end: https://docs.databricks.com/en/resources/ideas.html#create-an-idea-in-the-ideas-port...

Rainier_dw
by Databricks Partner
  • 4450 Views
  • 2 replies
  • 0 kudos

Help Needed: Obtaining and Applying Blade Bridge License for SSIS-to-DB SQL Conversion

Hello everyone, I’m in the process of using Blade Bridge to convert my SSIS .dtsx packages into Databricks SQL, but I’ve run into a licensing issue and could use some guidance. What I’m doing: Installed Blade Bridge and followed the required folder stru...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @Rainier_dw @Eric_Kieft, https://github.com/databrickslabs/lakebridge/issues/1819 is now tracked under https://github.com/databrickslabs/lakebridge/issues/1836 as an enhancement for product and https://github.com/databrickslabs/lakebridge/pull/1...

1 More Replies
rajanchaturvedi
by New Contributor
  • 3197 Views
  • 2 replies
  • 0 kudos

Executors getting killed while scaling Spark jobs on GPU using RAPIDS (NVIDIA)

Hi Team, I want to take advantage of Spark distribution over GPU clusters using RAPIDS (NVIDIA). Everything is set up: 1. The JAR is loaded correctly via an init script; the JAR is downloaded and uploaded to a volume (the workspace is Unity enabled) and via In...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Also try to gradually reduce spark.executor.memory. You need to allocate less memory to the JVM heap because the GPU needs a large chunk of the node's off-heap (system) memory. The GPU memory is allocated outside the JVM heap. If the heap is too large...
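The budgeting idea in this reply can be illustrated with simple arithmetic. This is a rough sketch, not RAPIDS or Databricks guidance: the function name, the reserve parameters, and every number below are hypothetical and chosen only to show how a larger off-heap reserve leaves a smaller ceiling for `spark.executor.memory`.

```python
def max_executor_heap_gb(node_memory_gb: float,
                         off_heap_reserve_gb: float,
                         os_reserve_gb: float = 2.0) -> float:
    """Upper bound for the JVM heap after setting aside off-heap memory.

    All inputs are illustrative; real values depend on the node type
    and on how much off-heap (system) memory the GPU workload needs.
    """
    heap = node_memory_gb - off_heap_reserve_gb - os_reserve_gb
    if heap <= 0:
        raise ValueError("reserves meet or exceed total node memory")
    return heap


# Hypothetical 64 GB node with 24 GB kept off-heap and 2 GB for the OS.
heap_gb = max_executor_heap_gb(64, 24)
print(f"spark.executor.memory={int(heap_gb)}g")  # prints spark.executor.memory=38g
```

Raising the off-heap reserve step by step and re-running the job mirrors the "gradually reduce" advice: each step lowers the heap ceiling until executors stop being killed.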

1 More Replies