Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

RevanthV
by New Contributor III
  • 27 Views
  • 1 replies
  • 0 kudos

POC on spark 4.x

I need to do a POC with Spark 3.5.7 and 4.x and need a local setup with a sample Kafka source. The POC would read data from Kafka via a streaming job and write to a Delta table, and I would like to do this on Spark 4.x. Do you know of any quick ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hello @RevanthV , I did some digging and here are some helpful tips: fast, reproducible ways to stand up a local Kafka source and run a Spark Structured Streaming job that writes to a Delta table, plus the common fixes for the conne...
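For reference, here is a minimal sketch of the read-from-Kafka, write-to-Delta pattern described above. The broker address localhost:9092, topic name events, table name poc_events, and checkpoint path are placeholders for your local setup, and you still need the matching spark-sql-kafka-0-10 and delta-spark packages on the classpath for your Spark version.

```python
# Minimal local sketch (assumes a Kafka broker on localhost:9092 with a topic
# named "events"; package/paths/table names are placeholders).
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = (SparkSession.builder
         .appName("kafka-to-delta-poc")
         .master("local[*]")
         .getOrCreate())

# Read the Kafka topic as a stream; key and value arrive as binary.
raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")
       .option("subscribe", "events")
       .option("startingOffsets", "earliest")
       .load())

events = raw.select(col("key").cast("string"),
                    col("value").cast("string"),
                    "timestamp")

# Write to a Delta table; the checkpoint location makes the stream restartable.
query = (events.writeStream.format("delta")
         .option("checkpointLocation", "/tmp/poc/checkpoints/events")
         .outputMode("append")
         .toTable("poc_events"))

query.awaitTermination()
```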

africke
by New Contributor
  • 99 Views
  • 3 replies
  • 2 kudos

Resolved! Cannot view nested MLflow experiment runs without changing URL

Hello, I've recently been testing out Databricks experiments for a project of mine. I wanted to nest runs, and then see these runs grouped by their parent in the experiments UI. For the longest time, I couldn't figure out how to do this. I was seeing ...
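For anyone landing here with the same question: the usual way to get runs grouped under a parent in the experiments UI is to open child runs with nested=True. A minimal sketch, where the experiment path, parameter, and metric names are purely illustrative:

```python
import mlflow

# Illustrative experiment path; any workspace path you can write to works.
mlflow.set_experiment("/Users/someone@example.com/nested-demo")

with mlflow.start_run(run_name="parent"):
    mlflow.log_param("model_family", "demo")
    for i in range(3):
        # nested=True attaches each child run to the active parent run,
        # which is what makes the UI group them under the parent.
        with mlflow.start_run(run_name=f"child-{i}", nested=True):
            mlflow.log_metric("score", 0.1 * i)
```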

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

@africke , If you’re happy with the results, please go ahead and accept this as the solution so others know it worked.

2 More Replies
kyeongmin_baek
by New Contributor II
  • 60 Views
  • 4 replies
  • 1 kudos

Resolved! Got an empty query file when cloning Query file.

Hello Community, In our AWS Databricks environment, we've encountered some behavior we don't understand while performing the following operation. When we clone a query file that already has content, a new file is created with the same name and “(clone)...

Latest Reply
Raman_Unifeye
Contributor III
  • 1 kudos

@kyeongmin_baek - There is no auto-save or Cmd+S for a query, as it gets saved only once it is attached to a cluster AND the 'Save' icon is used. However, it still stays in the cache as unsaved in that 'query' window, but cloning or other file operations may lose ...

3 More Replies
Shivaprasad
by New Contributor III
  • 45 Views
  • 2 replies
  • 0 kudos

Error while creating databricks custom app

I am trying to create a simple Databricks custom app but I am getting an "Error: Could not import 'app'" error. app.yaml file: env: - name: FLASK_APP   value: '/Workspace/Users/sam@xxx.com/databricks_apps/hello-world_2025_11_13-16_19/Gaap_commentry/app' comm...

Latest Reply
Shivaprasad
New Contributor III
  • 0 kudos

Thanks, I have modified the yaml file but still getting the "Error: Could not import 'app'" error. env:  - name: FLASK_APP    value: '/Workspace/Users/xxx@zzz.com/databricks_apps/hello-world_2025_11_13-16_19/Gaap_commentry' command: [  "flask",  "--app",  "...
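As a point of reference, Flask's "Could not import 'app'" means the --app / FLASK_APP value is not an importable module name; an absolute /Workspace/... path will not import, whereas a file named app.py in the app's working directory will. The sketch below shows the layout that target expects; the route, the assumed app.yaml command shape, and the folder names are illustrative guesses rather than the confirmed fix for this thread.

```python
# app.py, placed next to app.yaml (e.g. .../Gaap_commentry/app.py).
# Assumed app.yaml command shape (not confirmed from the thread):
#   command: ["flask", "--app", "app", "run"]
# i.e. the --app target is the module name "app", not a workspace path.
from flask import Flask

app = Flask(__name__)  # Flask looks for a module-level variable named "app"


@app.route("/")
def index():
    return "Hello from a Databricks app"
```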

1 More Replies
Escarigasco
by New Contributor II
  • 29 Views
  • 1 replies
  • 0 kudos

Usage Dashboard from Cody Austin Davis displays only DBUs or overall cost including the VM uptime?

Hello, I have been looking at the new dashboard created by @CodyA (great job!) and I was wondering if the cost displayed only provides visibility into the Databricks mark-up on each job (i.e. $ DBUs) or into the overall cost including cloud prov...

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

As the dashboard uses System Tables (system.billing.usage) to show spend across Jobs, SQL, and Notebooks, I don't believe it includes cloud provider VM costs.
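A quick way to check what the dashboard can see is to query the billing system table directly. The sketch below sums DBU usage by SKU for the last 30 days; column names follow the documented system.billing.usage schema, and there is no cloud-provider VM cost column in that table.

```python
# Sketch: DBU usage by SKU from the billing system table. These rows carry
# DBU quantities (usage_quantity/usage_unit), not cloud-provider VM charges.
dbus_by_sku = spark.sql("""
    SELECT sku_name,
           usage_unit,
           SUM(usage_quantity) AS total_usage
    FROM system.billing.usage
    WHERE usage_date >= date_sub(current_date(), 30)
    GROUP BY sku_name, usage_unit
    ORDER BY total_usage DESC
""")
dbus_by_sku.show(truncate=False)
```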

jitendrajha11
by Visitor
  • 27 Views
  • 4 replies
  • 1 kudos

Want to see logs for lineage view run events

Hi All, I need your help. When I run a job it completes successfully; when I click on the job there is a Lineage > View run events option, and when I click on it I see the steps below. Job Started: The job is triggered. Waiting for Cluster: The job wait...

Latest Reply
jitendrajha11
  • 1 kudos

Hi Team/Member, when I run a job it completes successfully; when I click on the job there is a Lineage > View run events option, and when I click on it we see the steps below (screenshot also attached). I want logs for those stages, where I wil...

3 More Replies
Sainath368
by Contributor
  • 77 Views
  • 4 replies
  • 4 kudos

Resolved! Autoloader Managed File events

Hi all, We are in the process of migrating from directory listing to managed file events in Azure Databricks. Our data is stored in an Azure Data Lake container with the following folder structure: To enable file events in Unity Catalog (UC), I created...

Latest Reply
Raman_Unifeye
Contributor III
  • 4 kudos

Recommended approach to continue your existing pattern: keep the External Location enabled for file events at the high-level path (/Landing), and run a separate Structured Streaming job for each table, specifying the full sub-path in the .load() function (...
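A minimal sketch of that per-table pattern follows, assuming the /Landing external location already has file events enabled. The storage account, sub-paths, schema/checkpoint locations, and target table names are placeholders, and the cloudFiles.useManagedFileEvents option name should be verified against the current Auto Loader documentation.

```python
# Sketch: one Auto Loader stream per table, each loading its own sub-path
# under the external location that has file events enabled.
BASE = "abfss://landing@storageacct.dfs.core.windows.net"  # placeholder

def start_table_stream(sub_path: str, target_table: str):
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        # Assumed option name for managed file events; check the docs.
        .option("cloudFiles.useManagedFileEvents", "true")
        .option("cloudFiles.schemaLocation", f"{BASE}/_schemas/{target_table}")
        .load(f"{BASE}/Landing/{sub_path}")
        .writeStream
        .option("checkpointLocation", f"{BASE}/_checkpoints/{target_table}")
        .trigger(availableNow=True)
        .toTable(target_table)
    )

q_orders = start_table_stream("sales/orders", "main.bronze.orders")
q_customers = start_table_stream("sales/customers", "main.bronze.customers")
```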

3 More Replies
smoortema
by Contributor
  • 63 Views
  • 3 replies
  • 3 kudos

how to know which join type was used (broadcast, shuffle hash or sort merge join) for a query?

What is the best way to know what kind of join was used for a SQL query between broadcast, shuffle hash and sort merge? How can the spark UI or the query plan be interpreted?

Latest Reply
Louis_Frolio
Databricks Employee
  • 3 kudos

@smoortema , Spark performance tuning is one of the hardest topics to teach or learn, and it’s even tougher to do justice to in a forum thread. That said, I’m really glad to see you asking the question. Tuning is challenging precisely because there a...
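To answer the concrete question in the title: the physical plan names the join operator, so both EXPLAIN and the SQL/DataFrame tab of the Spark UI show whether the optimizer picked BroadcastHashJoin, ShuffledHashJoin, or SortMergeJoin. A small sketch with placeholder table names:

```python
# Sketch: inspect which join strategy Spark chose. Table names are placeholders.
orders = spark.table("main.demo.orders")
customers = spark.table("main.demo.customers")

joined = orders.join(customers, "customer_id")

# The physical plan section names the operator:
#   BroadcastHashJoin / ShuffledHashJoin / SortMergeJoin
joined.explain(mode="formatted")

# The same information is available from SQL.
spark.sql("""
    EXPLAIN FORMATTED
    SELECT *
    FROM main.demo.orders o
    JOIN main.demo.customers c USING (customer_id)
""").show(truncate=False)
```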

2 More Replies
austinoyoung
by New Contributor III
  • 1329 Views
  • 7 replies
  • 4 kudos

create an external connection to oracle

Hi! I've been trying to create an external connection to oracle but getting the following error message "Detailed error message: ORA-00604: error occurred at recursive SQL level 1 ORA-01882: timezone region not found" I searched online and found some...

Latest Reply
TheOC
Contributor III
  • 4 kudos

Hey @austinoyoung, I don't have an Oracle database to be able to test this for you, but I believe you can get around this error by following the steps laid out here: https://stackoverflow.com/questions/9156379/ora-01882-timezone-region-not-found In ...
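The workaround in that Stack Overflow thread boils down to the Oracle JDBC driver property oracle.jdbc.timezoneAsRegion=false (or a valid user.timezone). If you want to test it from a notebook with a plain JDBC read rather than a Unity Catalog connection, a hedged sketch is below; the URL, credentials, secret scope, and table are placeholders, and the way to pass driver properties through a UC CREATE CONNECTION may differ.

```python
# Sketch: plain Spark JDBC read against Oracle with the timezone workaround.
# Host, service name, credentials, secret scope, and table are placeholders.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1")
    .option("dbtable", "HR.EMPLOYEES")
    .option("user", "scott")
    .option("password", dbutils.secrets.get("oracle", "password"))
    .option("driver", "oracle.jdbc.OracleDriver")
    # Workaround for ORA-01882: stop the driver from sending a region name.
    .option("oracle.jdbc.timezoneAsRegion", "false")
    .load()
)
df.show(5)
```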

6 More Replies
mits1
by New Contributor
  • 26 Views
  • 2 replies
  • 1 kudos

Resolved! Unable to navigate/login to Databricks Account Console

Hi, I have deployed Azure Databricks using an email id (say xx@gmail.com) and am able to launch a workspace. When I try to access the account console, it throws the error below: Selected user account does not exist in tenant 'Microsoft Services' and cannot access the ...

Latest Reply
Raman_Unifeye
Contributor III
  • 1 kudos

An old link but still relevant - https://github.com/cloudboxacademy/azure_databricks_course/blob/main/known-issues/unable-to-login-to-azure-databricks-account-console.md

1 More Replies
mh2587
by New Contributor II
  • 3785 Views
  • 1 replies
  • 1 kudos

Managing PCI-DSS Compliance and Access to Serverless Features in Azure Databricks

Hello Databricks Community, I am currently using Azure Databricks with PCI-DSS compliance enabled in our workspace, as maintaining stringent security standards is crucial for our organization. However, I've discovered that once PCI-DSS compliance is tu...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

Once PCI-DSS compliance is enabled in Azure Databricks, the workspace is locked into a set of restrictions to maintain those standards and safeguard sensitive data. These restrictions include disabling access to features like serverless compute, whic...

Zbyszek
by New Contributor
  • 51 Views
  • 2 replies
  • 1 kudos

Create a Hudi table with Databricks 17

Hi, I'm trying to run my existing code which has worked on the older DB version: CREATE TABLE IF NOT EXISTS catalog.demo.ABTHudi USING org.apache.hudi.Spark3DefaultSource OPTIONS ('primaryKey' = 'ID', 'hoodie.table.name' = 'ABTHudi') AS SELECT * FROM pa...

Latest Reply
Zbyszek
New Contributor
  • 1 kudos

Thank you for your response, I will wait for more updates on that. Regards, Ziggy

1 More Replies
Nes_Hdr
by New Contributor III
  • 5636 Views
  • 3 replies
  • 0 kudos

Path based access not supported for tables with row filters?

Hello, I have encountered an issue recently and have not been able to find a solution yet. I have a job on Databricks that creates a table using dbt (dbt-databricks>=1.0.0,<2.0.0). I am setting the location_root configuration so that this table is externa...

Data Engineering
dbt
row_filter
Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

This issue occurs because Databricks does not support applying row filters or column masks to external tables when path-based access is used. While you are able to set the row filter policy on your table with no immediate error, the limitation only b...
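In practice that means any reader of the row-filtered table has to go through the catalog name rather than the storage path. A small illustration; the catalog/schema/table and the abfss path are placeholders:

```python
# Name-based access goes through Unity Catalog, so row filters and
# column masks are enforced. Names below are placeholders.
df_ok = spark.read.table("main.finance.transactions")

# Path-based access bypasses the catalog, so it is not supported for tables
# with row filters or column masks and will fail or be blocked:
# df_bad = spark.read.format("delta").load(
#     "abfss://data@storageacct.dfs.core.windows.net/finance/transactions"
# )
```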

2 More Replies
Harun
by Honored Contributor
  • 3860 Views
  • 1 replies
  • 0 kudos

Inquiry Regarding Serverless Compute Operations After Cloud Account Suspension

Hello Everyone, I am currently benchmarking the new serverless compute feature and have observed an unexpected behavior under specific circumstances. During my benchmarking process, I executed two notebooks: one utilizing serverless compute and the ot...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Serverless compute resources in Azure Databricks and Azure SQL can operate independently of your cloud subscription state because they are fully managed, abstracted services that run on infrastructure controlled by Azure rather than your own cloud ac...

databricks8923
by New Contributor
  • 3969 Views
  • 1 replies
  • 0 kudos

DLT Pipeline, Autoloader, Streaming Query Exception: Could not find ADLS Gen2 Token

I have set up Autoloader to form a streaming table in my DLT pipeline:
import dlt
@dlt.table
def streamFiles_new():
    return (
        spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .op...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Your error suggests that while your DLT pipeline works for materialized views (batch reads), switching to a streaming table using Autoloader (readStream) is triggering an ADLS Gen2 authentication failure, specifically "Could not find ADLS Gen2 Token"...
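For reference, the minimal shape of such a streaming table is sketched below. The abfss path and schema location are placeholders; in a Unity Catalog pipeline the usual resolution for the token error is making sure that path is covered by a UC external location the pipeline can read (or, in legacy pipelines, that storage credentials are set in the pipeline's Spark configuration), not changing the table code itself.

```python
# Sketch: Auto Loader streaming table in a DLT pipeline. The abfss paths are
# placeholders and must be readable by the pipeline (e.g. via a Unity Catalog
# external location); "Could not find ADLS Gen2 Token" is an authentication
# problem with that path, not with this code shape.
import dlt

@dlt.table
def streamFiles_new():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation",
                "abfss://landing@storageacct.dfs.core.windows.net/_schemas/streamFiles_new")
        .load("abfss://landing@storageacct.dfs.core.windows.net/raw/events/")
    )
```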

